A significant number of hotel bookings are called off due to cancellations or no-shows. Typical reasons for cancellations include changes of plans and scheduling conflicts. Cancelling is often made easier by the option to do so free of charge or at a low cost, which benefits hotel guests but is a less desirable, and possibly revenue-diminishing, factor for hotels. Such losses are particularly high for last-minute cancellations.
The new technologies involving online booking channels have dramatically changed customers’ booking possibilities and behavior. This adds a further dimension to the challenge of how hotels handle cancellations, which are no longer limited to traditional booking and guest characteristics.
The cancellation of bookings impacts a hotel on various fronts:
The increasing number of cancellations calls for a Machine Learning based solution that can help predict which bookings are likely to be canceled. INN Hotels Group has a chain of hotels in Portugal. They are facing problems with the high number of booking cancellations and have reached out to your firm for data-driven solutions. As a data scientist, you have to analyze the data provided to find which factors have a high influence on booking cancellations, build a predictive model that can predict in advance which bookings are going to be canceled, and help formulate profitable policies for cancellations and refunds.
The data contains the different attributes of customers' booking details. The detailed data dictionary is given below.
Data Dictionary
# this will help in making the Python code more structured automatically (good coding practice)
%load_ext nb_black
import warnings
warnings.filterwarnings("ignore")
from statsmodels.tools.sm_exceptions import ConvergenceWarning
warnings.simplefilter("ignore", ConvergenceWarning)
# Libraries to help with reading and manipulating data
import pandas as pd
import numpy as np
# Library to split data
from sklearn.model_selection import train_test_split
# Libraries to help with data visualization
import matplotlib.pyplot as plt
import seaborn as sns
# Removes the limit for the number of displayed columns
pd.set_option("display.max_columns", None)
# Sets the limit for the number of displayed rows
pd.set_option("display.max_rows", 200)
# To build model for prediction
import statsmodels.stats.api as sms
from statsmodels.stats.outliers_influence import variance_inflation_factor
import statsmodels.api as sm
from statsmodels.tools.tools import add_constant
from sklearn.linear_model import LogisticRegression
# To get different metric scores
from sklearn.metrics import (
f1_score,
accuracy_score,
recall_score,
precision_score,
confusion_matrix,
roc_auc_score,
plot_confusion_matrix,
precision_recall_curve,
roc_curve,
)
data_hotel = pd.read_csv("INNHotelsGroup.csv")
data1 = data_hotel.copy()
data1.head(5)
| | Booking_ID | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INN00001 | 2 | 0 | 1 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 224 | 2017 | 10 | 2 | Offline | 0 | 0 | 0 | 65.00 | 0 | Not_Canceled |
| 1 | INN00002 | 2 | 0 | 2 | 3 | Not Selected | 0 | Room_Type 1 | 5 | 2018 | 11 | 6 | Online | 0 | 0 | 0 | 106.68 | 1 | Not_Canceled |
| 2 | INN00003 | 1 | 0 | 2 | 1 | Meal Plan 1 | 0 | Room_Type 1 | 1 | 2018 | 2 | 28 | Online | 0 | 0 | 0 | 60.00 | 0 | Canceled |
| 3 | INN00004 | 2 | 0 | 0 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.00 | 0 | Canceled |
| 4 | INN00005 | 2 | 0 | 1 | 1 | Not Selected | 0 | Room_Type 1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.50 | 0 | Canceled |
data1.duplicated().sum()
0
data1.shape
(36275, 19)
data1["type_of_meal_plan"] = data1["type_of_meal_plan"].str.replace(" ", "_")
data1["room_type_reserved"] = data1["room_type_reserved"].str.replace(" ", "_")
data1.head(5)
| | Booking_ID | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INN00001 | 2 | 0 | 1 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 224 | 2017 | 10 | 2 | Offline | 0 | 0 | 0 | 65.00 | 0 | Not_Canceled |
| 1 | INN00002 | 2 | 0 | 2 | 3 | Not_Selected | 0 | Room_Type_1 | 5 | 2018 | 11 | 6 | Online | 0 | 0 | 0 | 106.68 | 1 | Not_Canceled |
| 2 | INN00003 | 1 | 0 | 2 | 1 | Meal_Plan_1 | 0 | Room_Type_1 | 1 | 2018 | 2 | 28 | Online | 0 | 0 | 0 | 60.00 | 0 | Canceled |
| 3 | INN00004 | 2 | 0 | 0 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.00 | 0 | Canceled |
| 4 | INN00005 | 2 | 0 | 1 | 1 | Not_Selected | 0 | Room_Type_1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.50 | 0 | Canceled |
data1.tail(5)
| | Booking_ID | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 36270 | INN36271 | 3 | 0 | 2 | 6 | Meal_Plan_1 | 0 | Room_Type_4 | 85 | 2018 | 8 | 3 | Online | 0 | 0 | 0 | 167.80 | 1 | Not_Canceled |
| 36271 | INN36272 | 2 | 0 | 1 | 3 | Meal_Plan_1 | 0 | Room_Type_1 | 228 | 2018 | 10 | 17 | Online | 0 | 0 | 0 | 90.95 | 2 | Canceled |
| 36272 | INN36273 | 2 | 0 | 2 | 6 | Meal_Plan_1 | 0 | Room_Type_1 | 148 | 2018 | 7 | 1 | Online | 0 | 0 | 0 | 98.39 | 2 | Not_Canceled |
| 36273 | INN36274 | 2 | 0 | 0 | 3 | Not_Selected | 0 | Room_Type_1 | 63 | 2018 | 4 | 21 | Online | 0 | 0 | 0 | 94.50 | 0 | Canceled |
| 36274 | INN36275 | 2 | 0 | 1 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 207 | 2018 | 12 | 30 | Offline | 0 | 0 | 0 | 161.67 | 0 | Not_Canceled |
Let's drop the Booking_ID column, since it is only an identifier.
data = data1.drop(["Booking_ID"], axis=1)
data.head()
| | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 0 | 1 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 224 | 2017 | 10 | 2 | Offline | 0 | 0 | 0 | 65.00 | 0 | Not_Canceled |
| 1 | 2 | 0 | 2 | 3 | Not_Selected | 0 | Room_Type_1 | 5 | 2018 | 11 | 6 | Online | 0 | 0 | 0 | 106.68 | 1 | Not_Canceled |
| 2 | 1 | 0 | 2 | 1 | Meal_Plan_1 | 0 | Room_Type_1 | 1 | 2018 | 2 | 28 | Online | 0 | 0 | 0 | 60.00 | 0 | Canceled |
| 3 | 2 | 0 | 0 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.00 | 0 | Canceled |
| 4 | 2 | 0 | 1 | 1 | Not_Selected | 0 | Room_Type_1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.50 | 0 | Canceled |
print(f"There are {data.shape[0]} rows and {data.shape[1]} columns.")
There are 36275 rows and 18 columns.
data.sample(5, random_state=2)
| | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3281 | 2 | 0 | 2 | 3 | Meal_Plan_1 | 0 | Room_Type_1 | 303 | 2018 | 8 | 19 | Offline | 0 | 0 | 0 | 78.0 | 1 | Not_Canceled |
| 27326 | 2 | 0 | 1 | 1 | Not_Selected | 0 | Room_Type_1 | 91 | 2018 | 4 | 9 | Online | 0 | 0 | 0 | 58.9 | 1 | Not_Canceled |
| 22178 | 2 | 0 | 0 | 2 | Not_Selected | 0 | Room_Type_1 | 0 | 2018 | 4 | 7 | Online | 0 | 0 | 0 | 89.0 | 1 | Not_Canceled |
| 16974 | 2 | 0 | 0 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 93 | 2017 | 7 | 23 | Online | 0 | 0 | 0 | 76.5 | 1 | Canceled |
| 6931 | 2 | 0 | 0 | 1 | Meal_Plan_1 | 0 | Room_Type_4 | 88 | 2018 | 8 | 19 | Online | 0 | 0 | 0 | 131.4 | 2 | Not_Canceled |
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 36275 entries, 0 to 36274
Data columns (total 18 columns):
 #   Column                                Non-Null Count  Dtype
---  ------                                --------------  -----
 0   no_of_adults                          36275 non-null  int64
 1   no_of_children                        36275 non-null  int64
 2   no_of_weekend_nights                  36275 non-null  int64
 3   no_of_week_nights                     36275 non-null  int64
 4   type_of_meal_plan                     36275 non-null  object
 5   required_car_parking_space            36275 non-null  int64
 6   room_type_reserved                    36275 non-null  object
 7   lead_time                             36275 non-null  int64
 8   arrival_year                          36275 non-null  int64
 9   arrival_month                         36275 non-null  int64
 10  arrival_date                          36275 non-null  int64
 11  market_segment_type                   36275 non-null  object
 12  repeated_guest                        36275 non-null  int64
 13  no_of_previous_cancellations          36275 non-null  int64
 14  no_of_previous_bookings_not_canceled  36275 non-null  int64
 15  avg_price_per_room                    36275 non-null  float64
 16  no_of_special_requests                36275 non-null  int64
 17  booking_status                        36275 non-null  object
dtypes: float64(1), int64(13), object(4)
memory usage: 5.0+ MB
data.isnull().sum()
no_of_adults                            0
no_of_children                          0
no_of_weekend_nights                    0
no_of_week_nights                       0
type_of_meal_plan                       0
required_car_parking_space              0
room_type_reserved                      0
lead_time                               0
arrival_year                            0
arrival_month                           0
arrival_date                            0
market_segment_type                     0
repeated_guest                          0
no_of_previous_cancellations            0
no_of_previous_bookings_not_canceled    0
avg_price_per_room                      0
no_of_special_requests                  0
booking_status                          0
dtype: int64
There are no missing values in the dataset.
data.describe(include="all").T
| | count | unique | top | freq | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| no_of_adults | 36275.0 | NaN | NaN | NaN | 1.844962 | 0.518715 | 0.0 | 2.0 | 2.0 | 2.0 | 4.0 |
| no_of_children | 36275.0 | NaN | NaN | NaN | 0.105279 | 0.402648 | 0.0 | 0.0 | 0.0 | 0.0 | 10.0 |
| no_of_weekend_nights | 36275.0 | NaN | NaN | NaN | 0.810724 | 0.870644 | 0.0 | 0.0 | 1.0 | 2.0 | 7.0 |
| no_of_week_nights | 36275.0 | NaN | NaN | NaN | 2.2043 | 1.410905 | 0.0 | 1.0 | 2.0 | 3.0 | 17.0 |
| type_of_meal_plan | 36275 | 4 | Meal_Plan_1 | 27835 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| required_car_parking_space | 36275.0 | NaN | NaN | NaN | 0.030986 | 0.173281 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| room_type_reserved | 36275 | 7 | Room_Type_1 | 28130 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| lead_time | 36275.0 | NaN | NaN | NaN | 85.232557 | 85.930817 | 0.0 | 17.0 | 57.0 | 126.0 | 443.0 |
| arrival_year | 36275.0 | NaN | NaN | NaN | 2017.820427 | 0.383836 | 2017.0 | 2018.0 | 2018.0 | 2018.0 | 2018.0 |
| arrival_month | 36275.0 | NaN | NaN | NaN | 7.423653 | 3.069894 | 1.0 | 5.0 | 8.0 | 10.0 | 12.0 |
| arrival_date | 36275.0 | NaN | NaN | NaN | 15.596995 | 8.740447 | 1.0 | 8.0 | 16.0 | 23.0 | 31.0 |
| market_segment_type | 36275 | 5 | Online | 23214 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| repeated_guest | 36275.0 | NaN | NaN | NaN | 0.025637 | 0.158053 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| no_of_previous_cancellations | 36275.0 | NaN | NaN | NaN | 0.023349 | 0.368331 | 0.0 | 0.0 | 0.0 | 0.0 | 13.0 |
| no_of_previous_bookings_not_canceled | 36275.0 | NaN | NaN | NaN | 0.153411 | 1.754171 | 0.0 | 0.0 | 0.0 | 0.0 | 58.0 |
| avg_price_per_room | 36275.0 | NaN | NaN | NaN | 103.423539 | 35.089424 | 0.0 | 80.3 | 99.45 | 120.0 | 540.0 |
| no_of_special_requests | 36275.0 | NaN | NaN | NaN | 0.619655 | 0.786236 | 0.0 | 0.0 | 0.0 | 1.0 | 5.0 |
| booking_status | 36275 | 2 | Not_Canceled | 24390 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
# function to plot a boxplot and a histogram along the same scale.
def histogram_boxplot(data, feature, figsize=(12, 7), kde=False, bins=None):
    """
    Boxplot and histogram combined
    data: dataframe
    feature: dataframe column
    figsize: size of figure (default (12,7))
    kde: whether to show the density curve (default False)
    bins: number of bins for histogram (default None)
    """
    f2, (ax_box2, ax_hist2) = plt.subplots(
        nrows=2,  # Number of rows of the subplot grid= 2
        sharex=True,  # x-axis will be shared among all subplots
        gridspec_kw={"height_ratios": (0.25, 0.75)},
        figsize=figsize,
    )  # creating the 2 subplots
    sns.boxplot(
        data=data, x=feature, ax=ax_box2, showmeans=True, color="violet"
    )  # boxplot will be created and a marker will indicate the mean value of the column
    if bins:
        sns.histplot(
            data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins, palette="winter"
        )  # histogram with an explicit number of bins
    else:
        sns.histplot(
            data=data, x=feature, kde=kde, ax=ax_hist2
        )  # histogram with the default binning
    ax_hist2.axvline(
        data[feature].mean(), color="green", linestyle="--"
    )  # Add mean to the histogram
    ax_hist2.axvline(
        data[feature].median(), color="black", linestyle="-"
    )  # Add median to the histogram
def labeled_barplot(data, feature, perc=False, n=None):
    """
    Barplot with percentage at the top
    data: dataframe
    feature: dataframe column
    perc: whether to display percentages instead of count (default is False)
    n: displays the top n category levels (default is None, i.e., display all levels)
    """
    total = len(data[feature])  # length of the column
    count = data[feature].nunique()
    if n is None:
        plt.figure(figsize=(count + 1, 5))
    else:
        plt.figure(figsize=(n + 1, 5))
    plt.xticks(rotation=90, fontsize=15)
    ax = sns.countplot(
        data=data,
        x=feature,
        palette="Paired",
        order=data[feature].value_counts().index[:n].sort_values(),
    )
    for p in ax.patches:
        if perc:
            label = "{:.1f}%".format(
                100 * p.get_height() / total
            )  # percentage of each class of the category
        else:
            label = p.get_height()  # count of each level of the category
        x = p.get_x() + p.get_width() / 2  # horizontal center of the bar
        y = p.get_height()  # height of the bar
        ax.annotate(
            label,
            (x, y),
            ha="center",
            va="center",
            size=12,
            xytext=(0, 5),
            textcoords="offset points",
        )  # annotate the percentage or count
    plt.show()  # show the plot
Leading Questions:
data1.columns
Index(['Booking_ID', 'no_of_adults', 'no_of_children', 'no_of_weekend_nights',
'no_of_week_nights', 'type_of_meal_plan', 'required_car_parking_space',
'room_type_reserved', 'lead_time', 'arrival_year', 'arrival_month',
'arrival_date', 'market_segment_type', 'repeated_guest',
'no_of_previous_cancellations', 'no_of_previous_bookings_not_canceled',
'avg_price_per_room', 'no_of_special_requests', 'booking_status'],
dtype='object')
labeled_barplot(data, "no_of_adults", perc=True)
About 72% of bookings are for 2 adults, followed by 21.2% for 1 adult. Bookings with the maximum number of adults (4) round to 0.0%. The 0.4% of bookings with no adults could be outliers.
labeled_barplot(data, "no_of_children", perc=True)
92.6% of bookings have no children. The values 9 and 10 could be outliers.
# replacing 9, and 10 children with 3
data["no_of_children"] = data["no_of_children"].replace([9, 10], 3)
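The replacement above relies on `Series.replace` accepting a list of values to map to a single value. A minimal sketch on made-up values (the series below is illustrative and not taken from the dataset):

```python
import pandas as pd

# Toy values (not from the dataset), chosen to include the two suspicious counts.
children = pd.Series([0, 1, 2, 9, 10, 3])

# Map both 9 and 10 to 3 in one call; all other values are left untouched.
cleaned = children.replace([9, 10], 3)

print(cleaned.tolist())
```

Capping at 3 is the notebook's choice; the sketch only shows the mechanics of the call.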
histogram_boxplot(data, "lead_time", kde=True)
The lead_time distribution is right-skewed and there are outliers in the distribution.
histogram_boxplot(data, "avg_price_per_room", kde=True)
The average price is distributed around 100 euros and is slightly right-skewed. There are outliers in the distribution.
labeled_barplot(data, "arrival_year", perc=True)
labeled_barplot(data, "no_of_weekend_nights", perc=True)
The percentage of bookings decreases as the number of weekend nights increases.
labeled_barplot(data, "no_of_week_nights", perc=True)
Customers mostly book 1 or 2 week nights (31.5%). The percentage of bookings decreases as the number of week nights increases beyond 2.
labeled_barplot(data, "required_car_parking_space", perc=True)
96.9% of customers do not require a car parking space; only 3.1% do.
histogram_boxplot(data, "no_of_previous_cancellations", kde=True)
There are outliers in the distribution.
histogram_boxplot(data, "no_of_previous_bookings_not_canceled", kde=True)
There are outliers in the distribution.
labeled_barplot(data, "no_of_special_requests", perc=True)
Most bookings have no special requests, followed by bookings with one special request.
labeled_barplot(data, "arrival_month", perc=True)
Arrivals peak in October and are lowest in January.
labeled_barplot(data, "arrival_date", perc=True)
The arrival date is distributed nearly uniformly throughout the month, except for the 31st.
data["type_of_meal_plan"].nunique()
4
labeled_barplot(data, "type_of_meal_plan", perc=True)
There are 4 meal plan categories. Customers mostly prefer Meal Plan 1; the second most common choice is Not_Selected. Meal Plan 3 is the least preferred.
data["room_type_reserved"].nunique()
7
labeled_barplot(data, "room_type_reserved", perc=True)
There are 7 room types that can be reserved. Room type 1 is the most preferred, followed by room type 4. Room type 3 is the least preferred.
data["market_segment_type"].nunique()
5
labeled_barplot(data, "market_segment_type", perc=True)
Online is the largest market segment; Aviation is the smallest.
data["booking_status"].nunique()
2
labeled_barplot(data, "booking_status", perc=True)
67.2% of bookings were not canceled and 32.8% were canceled.
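As a quick check on which class is the majority, the shares can be recomputed from the counts reported elsewhere in this notebook (24390 Not_Canceled out of 36275 rows). A minimal sketch that rebuilds a label series from those counts, rather than reading the CSV:

```python
import pandas as pd

# Counts taken from the notebook output: 24390 Not_Canceled, 36275 rows in total.
n_total = 36275
n_not_canceled = 24390
n_canceled = n_total - n_not_canceled

# Rebuild a label series with the same class counts (row order is irrelevant here).
status = pd.Series(["Not_Canceled"] * n_not_canceled + ["Canceled"] * n_canceled)

# Share of each class in percent, rounded to one decimal.
shares = status.value_counts(normalize=True).mul(100).round(1)
print(shares)
```

On the real data the equivalent call would be `data["booking_status"].value_counts(normalize=True)`.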
data["booking_status"] = data["booking_status"].apply(
lambda x: 1 if x == "Canceled" else 0
)
labeled_barplot(data, "booking_status", perc=True)
plt.figure(figsize=(15, 7))
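# correlate numeric columns only; recent pandas versions raise an error on object columns
sns.heatmap(data.select_dtypes(include=np.number).corr(), annot=True, vmin=-1, vmax=1, cmap="Spectral")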
plt.show()
data.columns
Index(['no_of_adults', 'no_of_children', 'no_of_weekend_nights',
'no_of_week_nights', 'type_of_meal_plan', 'required_car_parking_space',
'room_type_reserved', 'lead_time', 'arrival_year', 'arrival_month',
'arrival_date', 'market_segment_type', 'repeated_guest',
'no_of_previous_cancellations', 'no_of_previous_bookings_not_canceled',
'avg_price_per_room', 'no_of_special_requests', 'booking_status'],
dtype='object')
Observations:
def distribution_plot_wrt_target(data, predictor, target):
    fig, axs = plt.subplots(2, 2, figsize=(12, 10))
    target_uniq = data[target].unique()
    axs[0, 0].set_title("Distribution of target for target=" + str(target_uniq[0]))
    sns.histplot(
        data=data[data[target] == target_uniq[0]],
        x=predictor,
        kde=True,
        ax=axs[0, 0],
        color="teal",
        stat="density",
    )
    axs[0, 1].set_title("Distribution of target for target=" + str(target_uniq[1]))
    sns.histplot(
        data=data[data[target] == target_uniq[1]],
        x=predictor,
        kde=True,
        ax=axs[0, 1],
        color="orange",
        stat="density",
    )
    axs[1, 0].set_title("Boxplot w.r.t target")
    sns.boxplot(data=data, x=target, y=predictor, ax=axs[1, 0], palette="gist_rainbow")
    axs[1, 1].set_title("Boxplot (without outliers) w.r.t target")
    sns.boxplot(
        data=data,
        x=target,
        y=predictor,
        ax=axs[1, 1],
        showfliers=False,
        palette="gist_rainbow",
    )
    plt.tight_layout()
    plt.show()
plt.figure(figsize=(10, 6))
sns.boxplot(data=data, x="market_segment_type", y="avg_price_per_room")
plt.show()
plt.figure(figsize=(15, 5))
sns.barplot(data=data, x="market_segment_type", y="avg_price_per_room")
plt.xticks(rotation=90)
plt.show()
The average price per room is highest for the Online segment and lowest for the Complementary segment. All segments except Aviation show outliers.
def stacked_barplot(data, predictor, target):
    """
    Print the category counts and plot a stacked bar chart
    data: dataframe
    predictor: independent variable
    target: target variable
    """
    count = data[predictor].nunique()
    sorter = data[target].value_counts().index[-1]
    tab1 = pd.crosstab(data[predictor], data[target], margins=True).sort_values(
        by=sorter, ascending=False
    )
    print(tab1)
    print("-" * 120)
    tab = pd.crosstab(data[predictor], data[target], normalize="index").sort_values(
        by=sorter, ascending=False
    )
    tab.plot(kind="bar", stacked=True, figsize=(count + 5, 5))
    plt.legend(
        loc="lower left", frameon=False,
    )
    plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
    plt.show()
stacked_barplot(data1, "market_segment_type", "booking_status")
booking_status       Canceled  Not_Canceled    All
market_segment_type
All                     11885         24390  36275
Online                   8475         14739  23214
Offline                  3153          7375  10528
Corporate                 220          1797   2017
Aviation                   37            88    125
Complementary               0           391    391
------------------------------------------------------------------------------------------------------------------------
plt.figure(figsize=(15, 5))
sns.countplot(data=data, x="market_segment_type", hue="booking_status")
plt.xticks(rotation=90)
plt.show()
Cancellations are highest in the Online segment (8,475 of 23,214 bookings, about 36.5%). There are no cancellations in the Complementary segment.
data1.columns
Index(['Booking_ID', 'no_of_adults', 'no_of_children', 'no_of_weekend_nights',
'no_of_week_nights', 'type_of_meal_plan', 'required_car_parking_space',
'room_type_reserved', 'lead_time', 'arrival_year', 'arrival_month',
'arrival_date', 'market_segment_type', 'repeated_guest',
'no_of_previous_cancellations', 'no_of_previous_bookings_not_canceled',
'avg_price_per_room', 'no_of_special_requests', 'booking_status'],
dtype='object')
plt.figure(figsize=(15, 5))
sns.countplot(data=data, x="no_of_weekend_nights", hue="booking_status")
plt.xticks(rotation=90)
plt.show()
plt.figure(figsize=(15, 5))
sns.countplot(data=data, x="no_of_special_requests", hue="booking_status")
plt.xticks(rotation=90)
plt.show()
As the number of special requests increases, booking cancellations decrease. This suggests that customers whose requests are accommodated are less likely to cancel.
distribution_plot_wrt_target(data, "avg_price_per_room", "booking_status")
df_ave_price = data[data["booking_status"] == 1]
df_ave_price.head()
| | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 0 | 2 | 1 | Meal_Plan_1 | 0 | Room_Type_1 | 1 | 2018 | 2 | 28 | Online | 0 | 0 | 0 | 60.0 | 0 | 1 |
| 3 | 2 | 0 | 0 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.0 | 0 | 1 |
| 4 | 2 | 0 | 1 | 1 | Not_Selected | 0 | Room_Type_1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.5 | 0 | 1 |
| 5 | 2 | 0 | 0 | 2 | Meal_Plan_2 | 0 | Room_Type_1 | 346 | 2018 | 9 | 13 | Online | 0 | 0 | 0 | 115.0 | 1 | 1 |
| 12 | 2 | 0 | 2 | 1 | Not_Selected | 0 | Room_Type_1 | 30 | 2018 | 11 | 26 | Online | 0 | 0 | 0 | 88.0 | 0 | 1 |
histogram_boxplot(df_ave_price, "avg_price_per_room", kde=True)
Among canceled bookings, the number of cancellations increases until the average price reaches about 110 euros and decreases beyond that.
plt.figure(figsize=(10, 6))
sns.boxplot(data=data, x="no_of_special_requests", y="avg_price_per_room")
plt.show()
The average price per room varies with the number of special requests. For bookings with 4 special requests, the price is comparatively high.
distribution_plot_wrt_target(data1, "lead_time", "booking_status")
df_canceled = data[data["booking_status"] == 1]
df_canceled.head()
| | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 0 | 2 | 1 | Meal_Plan_1 | 0 | Room_Type_1 | 1 | 2018 | 2 | 28 | Online | 0 | 0 | 0 | 60.0 | 0 | 1 |
| 3 | 2 | 0 | 0 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.0 | 0 | 1 |
| 4 | 2 | 0 | 1 | 1 | Not_Selected | 0 | Room_Type_1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.5 | 0 | 1 |
| 5 | 2 | 0 | 0 | 2 | Meal_Plan_2 | 0 | Room_Type_1 | 346 | 2018 | 9 | 13 | Online | 0 | 0 | 0 | 115.0 | 1 | 1 |
| 12 | 2 | 0 | 2 | 1 | Not_Selected | 0 | Room_Type_1 | 30 | 2018 | 11 | 26 | Online | 0 | 0 | 0 | 88.0 | 0 | 1 |
histogram_boxplot(df_canceled, "lead_time", kde=True)
Among canceled bookings, the count of cancellations decreases as lead time increases.
# take an explicit copy so that adding a column below does not write into a view of data
family_data = data[(data["no_of_children"] >= 0) & (data["no_of_adults"] > 1)].copy()
family_data.shape
(28441, 18)
family_data["no_of_family_members"] = (
family_data["no_of_adults"] + family_data["no_of_children"]
)
family_data.head()
| | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status | no_of_family_members |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 0 | 1 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 224 | 2017 | 10 | 2 | Offline | 0 | 0 | 0 | 65.00 | 0 | 0 | 2 |
| 1 | 2 | 0 | 2 | 3 | Not_Selected | 0 | Room_Type_1 | 5 | 2018 | 11 | 6 | Online | 0 | 0 | 0 | 106.68 | 1 | 0 | 2 |
| 3 | 2 | 0 | 0 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.00 | 0 | 1 | 2 |
| 4 | 2 | 0 | 1 | 1 | Not_Selected | 0 | Room_Type_1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.50 | 0 | 1 | 2 |
| 5 | 2 | 0 | 0 | 2 | Meal_Plan_2 | 0 | Room_Type_1 | 346 | 2018 | 9 | 13 | Online | 0 | 0 | 0 | 115.00 | 1 | 1 | 2 |
histogram_boxplot(family_data, "no_of_family_members", kde=True)
stacked_barplot(family_data, "no_of_family_members", "booking_status")
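booking_status            0     1    All
no_of_family_members
All                   18456  9985  28441
2                     15506  8213  23719
3                      2425  1368   3793
4                       514   398    912
5                        11     6     17
------------------------------------------------------------------------------------------------------------------------
Bookings with 2 family members have the most cancellations (8,213) because they are by far the most common group. As a share of bookings, however, cancellations do not decrease with family size: the rate is about 34.6% for 2 members, 36.1% for 3, and 43.6% for 4. The 5-member group has only 17 bookings, so its rate is not reliable.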
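Raw counts and rates can tell different stories when group sizes differ this much. A minimal sketch that computes the cancellation rate per family size from the crosstab counts printed above (the frame is rebuilt by hand from that output; the column names are illustrative):

```python
import pandas as pd

# Counts copied from the crosstab output (booking_status 0 = not canceled, 1 = canceled).
family_counts = pd.DataFrame(
    {"not_canceled": [15506, 2425, 514, 11], "canceled": [8213, 1368, 398, 6]},
    index=pd.Index([2, 3, 4, 5], name="no_of_family_members"),
)

# Rate = canceled bookings divided by all bookings in the group, in percent.
group_total = family_counts["not_canceled"] + family_counts["canceled"]
family_rate = (100 * family_counts["canceled"] / group_total).round(1)

print(family_rate)
```

On the full dataframe, `family_data.groupby("no_of_family_members")["booking_status"].mean()` should give the same rates as fractions, since the target is encoded as 0/1.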
# take an explicit copy so that adding a column below does not write into a view of data
stay_data = data[(data["no_of_week_nights"] > 0) & (data["no_of_weekend_nights"] > 0)].copy()
stay_data.shape
(17094, 18)
stay_data["total_days"] = (
stay_data["no_of_week_nights"] + stay_data["no_of_weekend_nights"]
)
stacked_barplot(stay_data, "total_days", "booking_status")
booking_status      0     1    All
total_days
All             10979  6115  17094
3                3689  2183   5872
4                2977  1387   4364
5                1593   738   2331
2                1301   639   1940
6                 566   465   1031
7                 590   383    973
8                 100    79    179
10                 51    58    109
9                  58    53    111
14                  5    27     32
15                  5    26     31
13                  3    15     18
12                  9    15     24
11                 24    15     39
20                  3     8     11
19                  1     5      6
16                  1     5      6
17                  1     4      5
18                  0     3      3
21                  1     3      4
22                  0     2      2
23                  1     1      2
24                  0     1      1
------------------------------------------------------------------------------------------------------------------------
The cancellation rate tends to increase as the total number of days increases, although the longest stays have very few bookings.
stacked_barplot(data, "repeated_guest", "booking_status")
booking_status      0      1    All
repeated_guest
All             24390  11885  36275
0               23476  11869  35345
1                 914     16    930
------------------------------------------------------------------------------------------------------------------------
Repeated guests cancel far less often (16 of 930 bookings, about 1.7%) than new guests (11,869 of 35,345, about 33.6%), possibly because they are already satisfied with the hotel.
# grouping the data on arrival months and extracting the count of bookings
monthly_data = data.groupby(["arrival_month"])["booking_status"].count()
# creating a dataframe with months and count of customers in each month
monthly_data = pd.DataFrame(
{"Month": list(monthly_data.index), "Guests": list(monthly_data.values)}
)
# plotting the trend over different months
plt.figure(figsize=(10, 5))
sns.lineplot(data=monthly_data, x="Month", y="Guests")
plt.show()
stacked_barplot(data, "arrival_month", "booking_status")
booking_status      0      1    All
arrival_month
All             24390  11885  36275
10               3437   1880   5317
9                3073   1538   4611
8                2325   1488   3813
7                1606   1314   2920
6                1912   1291   3203
4                1741    995   2736
5                1650    948   2598
11               2105    875   2980
3                1658    700   2358
2                1274    430   1704
12               2619    402   3021
1                 990     24   1014
------------------------------------------------------------------------------------------------------------------------
October is the busiest month (5,317 bookings). It also has the most cancellations (1,880), while January has the fewest (24).
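The busiest month is not necessarily the month with the highest cancellation rate. A minimal sketch that recomputes monthly rates from the crosstab counts printed above (the frame is typed in by hand from that output; the column names are illustrative):

```python
import pandas as pd

# Counts copied from the crosstab output, ordered January to December.
monthly_counts = pd.DataFrame(
    {
        "not_canceled": [990, 1274, 1658, 1741, 1650, 1912,
                         1606, 2325, 3073, 3437, 2105, 2619],
        "canceled": [24, 430, 700, 995, 948, 1291,
                     1314, 1488, 1538, 1880, 875, 402],
    },
    index=pd.Index(range(1, 13), name="arrival_month"),
)

# Cancellation rate per month in percent.
month_total = monthly_counts["not_canceled"] + monthly_counts["canceled"]
monthly_rate = (100 * monthly_counts["canceled"] / month_total).round(1)

print(monthly_rate.sort_values(ascending=False))
print("Most cancellations (count):", monthly_counts["canceled"].idxmax())
print("Highest cancellation rate:", monthly_rate.idxmax())
```

With these counts, October has the most cancellations in absolute terms, while July has the highest rate (1,314 of 2,920 bookings, 45.0%); January is lowest on both measures.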
plt.figure(figsize=(10, 5))
sns.lineplot(data=data, x="arrival_month", y="avg_price_per_room")
plt.show()
The average price increases until September and then decreases.
data[data["avg_price_per_room"] == 0]
| | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 63 | 1 | 0 | 0 | 1 | Meal_Plan_1 | 0 | Room_Type_1 | 2 | 2017 | 9 | 10 | Complementary | 0 | 0 | 0 | 0.0 | 1 | 0 |
| 145 | 1 | 0 | 0 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 13 | 2018 | 6 | 1 | Complementary | 1 | 3 | 5 | 0.0 | 1 | 0 |
| 209 | 1 | 0 | 0 | 0 | Meal_Plan_1 | 0 | Room_Type_1 | 4 | 2018 | 2 | 27 | Complementary | 0 | 0 | 0 | 0.0 | 1 | 0 |
| 266 | 1 | 0 | 0 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 1 | 2017 | 8 | 12 | Complementary | 1 | 0 | 1 | 0.0 | 1 | 0 |
| 267 | 1 | 0 | 2 | 1 | Meal_Plan_1 | 0 | Room_Type_1 | 4 | 2017 | 8 | 23 | Complementary | 0 | 0 | 0 | 0.0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 35983 | 1 | 0 | 0 | 1 | Meal_Plan_1 | 0 | Room_Type_7 | 0 | 2018 | 6 | 7 | Complementary | 1 | 4 | 17 | 0.0 | 1 | 0 |
| 36080 | 1 | 0 | 1 | 1 | Meal_Plan_1 | 0 | Room_Type_7 | 0 | 2018 | 3 | 21 | Complementary | 1 | 3 | 15 | 0.0 | 1 | 0 |
| 36114 | 1 | 0 | 0 | 1 | Meal_Plan_1 | 0 | Room_Type_1 | 1 | 2018 | 3 | 2 | Online | 0 | 0 | 0 | 0.0 | 0 | 0 |
| 36217 | 2 | 0 | 2 | 1 | Meal_Plan_1 | 0 | Room_Type_2 | 3 | 2017 | 8 | 9 | Online | 0 | 0 | 0 | 0.0 | 2 | 0 |
| 36250 | 1 | 0 | 0 | 2 | Meal_Plan_2 | 0 | Room_Type_1 | 6 | 2017 | 12 | 10 | Online | 0 | 0 | 0 | 0.0 | 0 | 0 |
545 rows × 18 columns
data.loc[data["avg_price_per_room"] == 0, "market_segment_type"].value_counts()
Complementary    354
Online           191
Name: market_segment_type, dtype: int64
# Calculating the 25th quantile
Q1 = data["avg_price_per_room"].quantile(0.25)
# Calculating the 75th quantile
Q3 = data["avg_price_per_room"].quantile(0.75)
# Calculating IQR
IQR = Q3 - Q1
# Calculating value of upper whisker
Upper_Whisker = Q3 + 1.5 * IQR
Upper_Whisker
179.55
data.loc[data["avg_price_per_room"] >= 500, "avg_price_per_room"] = Upper_Whisker
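The upper whisker follows the usual boxplot rule, Q3 + 1.5 × IQR. A minimal sketch that reproduces the value above from the quartiles reported by `describe()` (80.3 and 120.0) and applies the same capping rule to a few made-up prices (the toy series is illustrative, not dataset rows):

```python
import pandas as pd

# Quartiles of avg_price_per_room as reported by describe() above.
q1, q3 = 80.3, 120.0
iqr = q3 - q1
upper_whisker = q3 + 1.5 * iqr
print(round(upper_whisker, 2))

# Toy prices (not from the dataset) to show the capping step.
prices = pd.Series([65.0, 99.45, 250.0, 540.0])
capped = prices.copy()

# Same rule as the notebook: only values of 500 or more are replaced by the whisker.
capped.loc[capped >= 500] = upper_whisker
print(capped.round(2).tolist())
```

Note that the notebook's threshold is 500, not the whisker itself, so a price such as 250 stays unchanged even though it lies above 179.55.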
numeric_columns = data.select_dtypes(include=np.number).columns.tolist()
numeric_columns.remove("booking_status")
plt.figure(figsize=(15, 12))
for i, variable in enumerate(numeric_columns):
    plt.subplot(4, 4, i + 1)
    plt.boxplot(data[variable], whis=1.5)
    plt.tight_layout()
    plt.title(variable)
plt.show()
data.columns
Index(['no_of_adults', 'no_of_children', 'no_of_weekend_nights',
'no_of_week_nights', 'type_of_meal_plan', 'required_car_parking_space',
'room_type_reserved', 'lead_time', 'arrival_year', 'arrival_month',
'arrival_date', 'market_segment_type', 'repeated_guest',
'no_of_previous_cancellations', 'no_of_previous_bookings_not_canceled',
'avg_price_per_room', 'no_of_special_requests', 'booking_status'],
dtype='object')
X = data.drop(["booking_status"], axis=1)
Y = data["booking_status"]
X = pd.get_dummies(X, drop_first=True)
# Splitting data in train and test sets
X_train, X_test, y_train, y_test = train_test_split(
X, Y, test_size=0.30, random_state=1
)
print("Shape of Training set : ", X_train.shape)
print("Shape of test set : ", X_test.shape)
print("Percentage of classes in training set:")
print(y_train.value_counts(normalize=True))
print("Percentage of classes in test set:")
print(y_test.value_counts(normalize=True))
Shape of Training set :  (25392, 27)
Shape of test set :  (10883, 27)
Percentage of classes in training set:
0    0.670644
1    0.329356
Name: booking_status, dtype: float64
Percentage of classes in test set:
0    0.676376
1    0.323624
Name: booking_status, dtype: float64
data.head(10)
| | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 0 | 1 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 224 | 2017 | 10 | 2 | Offline | 0 | 0 | 0 | 65.00 | 0 | 0 |
| 1 | 2 | 0 | 2 | 3 | Not_Selected | 0 | Room_Type_1 | 5 | 2018 | 11 | 6 | Online | 0 | 0 | 0 | 106.68 | 1 | 0 |
| 2 | 1 | 0 | 2 | 1 | Meal_Plan_1 | 0 | Room_Type_1 | 1 | 2018 | 2 | 28 | Online | 0 | 0 | 0 | 60.00 | 0 | 1 |
| 3 | 2 | 0 | 0 | 2 | Meal_Plan_1 | 0 | Room_Type_1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.00 | 0 | 1 |
| 4 | 2 | 0 | 1 | 1 | Not_Selected | 0 | Room_Type_1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.50 | 0 | 1 |
| 5 | 2 | 0 | 0 | 2 | Meal_Plan_2 | 0 | Room_Type_1 | 346 | 2018 | 9 | 13 | Online | 0 | 0 | 0 | 115.00 | 1 | 1 |
| 6 | 2 | 0 | 1 | 3 | Meal_Plan_1 | 0 | Room_Type_1 | 34 | 2017 | 10 | 15 | Online | 0 | 0 | 0 | 107.55 | 1 | 0 |
| 7 | 2 | 0 | 1 | 3 | Meal_Plan_1 | 0 | Room_Type_4 | 83 | 2018 | 12 | 26 | Online | 0 | 0 | 0 | 105.61 | 1 | 0 |
| 8 | 3 | 0 | 0 | 4 | Meal_Plan_1 | 0 | Room_Type_1 | 121 | 2018 | 7 | 6 | Offline | 0 | 0 | 0 | 96.90 | 1 | 0 |
| 9 | 2 | 0 | 0 | 5 | Meal_Plan_1 | 0 | Room_Type_4 | 44 | 2018 | 10 | 18 | Online | 0 | 0 | 0 | 133.44 | 3 | 0 |
False negative: predicting that a customer will not cancel their booking when, in reality, the customer cancels.
False positive: predicting that a customer will cancel their booking when, in reality, the customer does not cancel.
If we predict that a booking will not be canceled and it is canceled, the hotel loses resources and has to bear additional costs.
If we predict that a booking will be canceled and it is not, the hotel may fail to provide satisfactory service because it assumed the booking would be canceled. This might damage the brand's reputation.
The hotel would want the F1 score to be maximized. F1 is the harmonic mean of precision and recall, so a higher F1 score means that both false negatives and false positives are kept low.
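As a quick illustration of how the four metrics relate to the two kinds of error, the sketch below computes them from confusion-matrix counts. The counts and the function name `classification_metrics` are made up for this example; class 1 is assumed to mean "canceled", as in the rest of the notebook.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, recall, precision and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)  # share of actual cancellations that were caught
    precision = tp / (tp + fp)  # share of predicted cancellations that were real
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"Accuracy": accuracy, "Recall": recall, "Precision": precision, "F1": f1}


# hypothetical counts: 60 cancellations caught, 20 false alarms,
# 40 cancellations missed, 180 bookings correctly predicted as kept
metrics = classification_metrics(tp=60, fp=20, fn=40, tn=180)
print(metrics)
```

False negatives lower recall and false positives lower precision, and F1 drops when either of the two is low.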
def model_performance_classification_statsmodels(
    model, predictors, target, threshold=0.5
):
    """
    Function to compute different metrics to check classification model performance
    model: classifier
    predictors: independent variables
    target: dependent variable
    threshold: threshold for classifying the observation as class 1
    """
    # classifying an observation as class 1 when its predicted probability exceeds the threshold
    pred = (model.predict(predictors) > threshold).astype(int)
    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred)  # to compute Recall
    precision = precision_score(target, pred)  # to compute Precision
    f1 = f1_score(target, pred)  # to compute F1-score
    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {"Accuracy": acc, "Recall": recall, "Precision": precision, "F1": f1,},
        index=[0],
    )
    return df_perf
def confusion_matrix_statsmodels(model, predictors, target, threshold=0.5):
    """
    To plot the confusion_matrix with percentages
    model: classifier
    predictors: independent variables
    target: dependent variable
    threshold: threshold for classifying the observation as class 1
    """
    y_pred = model.predict(predictors) > threshold
    cm = confusion_matrix(target, y_pred)
    labels = np.asarray(
        [
            ["{0:0.0f}".format(item) + "\n{0:.2%}".format(item / cm.flatten().sum())]
            for item in cm.flatten()
        ]
    ).reshape(2, 2)
    plt.figure(figsize=(6, 4))
    sns.heatmap(cm, annot=labels, fmt="")
    plt.ylabel("True label")
    plt.xlabel("Predicted label")
X = data.drop(["booking_status"], axis=1)
Y = data["booking_status"]
X = pd.get_dummies(X, drop_first=True)
# adding constant
X = sm.add_constant(X)
# Splitting data in train and test sets
X_train, X_test, y_train, y_test = train_test_split(
X, Y, test_size=0.30, random_state=1
)
# fitting logistic regression model
logit = sm.Logit(y_train, X_train.astype(float))
lg = logit.fit(disp=False)
print(lg.summary())
Logit Regression Results
==============================================================================
Dep. Variable: booking_status No. Observations: 25392
Model: Logit Df Residuals: 25364
Method: MLE Df Model: 27
Date: Fri, 25 Feb 2022 Pseudo R-squ.: 0.3292
Time: 11:37:37 Log-Likelihood: -10794.
converged: False LL-Null: -16091.
Covariance Type: nonrobust LLR p-value: 0.000
========================================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------------------------
const -922.8266 120.832 -7.637 0.000 -1159.653 -686.000
no_of_adults 0.1137 0.038 3.019 0.003 0.040 0.188
no_of_children 0.1580 0.062 2.544 0.011 0.036 0.280
no_of_weekend_nights 0.1067 0.020 5.395 0.000 0.068 0.145
no_of_week_nights 0.0397 0.012 3.235 0.001 0.016 0.064
required_car_parking_space -1.5943 0.138 -11.565 0.000 -1.865 -1.324
lead_time 0.0157 0.000 58.863 0.000 0.015 0.016
arrival_year 0.4561 0.060 7.617 0.000 0.339 0.573
arrival_month -0.0417 0.006 -6.441 0.000 -0.054 -0.029
arrival_date 0.0005 0.002 0.259 0.796 -0.003 0.004
repeated_guest -2.3472 0.617 -3.806 0.000 -3.556 -1.139
no_of_previous_cancellations 0.2664 0.086 3.108 0.002 0.098 0.434
no_of_previous_bookings_not_canceled -0.1727 0.153 -1.131 0.258 -0.472 0.127
avg_price_per_room 0.0188 0.001 25.396 0.000 0.017 0.020
no_of_special_requests -1.4689 0.030 -48.782 0.000 -1.528 -1.410
type_of_meal_plan_Meal_Plan_2 0.1756 0.067 2.636 0.008 0.045 0.306
type_of_meal_plan_Meal_Plan_3 17.3584 3987.841 0.004 0.997 -7798.666 7833.383
type_of_meal_plan_Not_Selected 0.2784 0.053 5.247 0.000 0.174 0.382
room_type_reserved_Room_Type_2 -0.3605 0.131 -2.748 0.006 -0.618 -0.103
room_type_reserved_Room_Type_3 -0.0012 1.310 -0.001 0.999 -2.568 2.566
room_type_reserved_Room_Type_4 -0.2823 0.053 -5.304 0.000 -0.387 -0.178
room_type_reserved_Room_Type_5 -0.7189 0.209 -3.438 0.001 -1.129 -0.309
room_type_reserved_Room_Type_6 -0.9501 0.151 -6.274 0.000 -1.247 -0.653
room_type_reserved_Room_Type_7 -1.4003 0.294 -4.770 0.000 -1.976 -0.825
market_segment_type_Complementary -40.5976 5.65e+05 -7.19e-05 1.000 -1.11e+06 1.11e+06
market_segment_type_Corporate -1.1924 0.266 -4.483 0.000 -1.714 -0.671
market_segment_type_Offline -2.1946 0.255 -8.621 0.000 -2.694 -1.696
market_segment_type_Online -0.3995 0.251 -1.590 0.112 -0.892 0.093
========================================================================================================
print("Training performance:")
model_performance_classification_statsmodels(lg, X_train, y_train)
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.806002 | 0.634103 | 0.739713 | 0.682848 |
A negative coefficient means that the probability of a customer canceling their booking decreases as the corresponding attribute value increases.
A positive coefficient means that the probability of a customer canceling their booking increases as the corresponding attribute value increases.
The p-value of a variable indicates whether the variable is significant. At a significance level of 0.05 (5%), any variable with a p-value less than 0.05 is considered significant.
However, these variables might be multicollinear, which would affect the p-values.
We have to remove multicollinearity from the data to get reliable coefficients and p-values.
There are different ways of detecting (or testing for) multicollinearity; one of them is the Variance Inflation Factor (VIF).
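The VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. The sketch below illustrates the idea for the simplest case of exactly two predictors, where R² is just the squared Pearson correlation between them. The data and the function names `pearson_r` and `vif_two_predictors` are assumptions made for this illustration; this is not the statsmodels implementation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def vif_two_predictors(x, y):
    """VIF of either predictor when the model has exactly two predictors."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)


# made-up predictors
x = [1.0, 2.0, 3.0, 4.0, 5.0]
unrelated = [1.0, -2.0, 2.0, -2.0, 1.0]  # uncorrelated with x
near_copy = [1.1, 1.9, 3.2, 3.9, 5.1]  # almost a copy of x

vif_unrelated = vif_two_predictors(x, unrelated)
vif_near_copy = vif_two_predictors(x, near_copy)
print(vif_unrelated, vif_near_copy)
```

An uncorrelated pair gives a VIF of 1 (no inflation), while a nearly duplicated predictor gives a very large VIF, which is the pattern to look for in the VIF output.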
vif_series = pd.Series(
[variance_inflation_factor(X_train.values, i) for i in range(X_train.shape[1])],
index=X_train.columns,
dtype=float,
)
print("Series before feature selection: \n\n{}\n".format(vif_series))
Series before feature selection: 

const                                   3.949769e+07
no_of_adults                            1.351135e+00
no_of_children                          2.093583e+00
no_of_weekend_nights                    1.069484e+00
no_of_week_nights                       1.095711e+00
required_car_parking_space              1.039972e+00
lead_time                               1.395175e+00
arrival_year                            1.431904e+00
arrival_month                           1.276334e+00
arrival_date                            1.006795e+00
repeated_guest                          1.783576e+00
no_of_previous_cancellations            1.395693e+00
no_of_previous_bookings_not_canceled    1.652000e+00
avg_price_per_room                      2.068603e+00
no_of_special_requests                  1.247981e+00
type_of_meal_plan_Meal_Plan_2           1.273283e+00
type_of_meal_plan_Meal_Plan_3           1.025258e+00
type_of_meal_plan_Not_Selected          1.273060e+00
room_type_reserved_Room_Type_2          1.105954e+00
room_type_reserved_Room_Type_3          1.003303e+00
room_type_reserved_Room_Type_4          1.363606e+00
room_type_reserved_Room_Type_5          1.028000e+00
room_type_reserved_Room_Type_6          2.056136e+00
room_type_reserved_Room_Type_7          1.118156e+00
market_segment_type_Complementary       4.502756e+00
market_segment_type_Corporate           1.692829e+01
market_segment_type_Offline             6.411564e+01
market_segment_type_Online              7.118026e+01
dtype: float64
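The market_segment_type dummy variables exhibit high multicollinearity: market_segment_type_Corporate (16.9), market_segment_type_Offline (64.1) and market_segment_type_Online (71.2) all have VIF values well above 5. The VIF of the constant can be ignored.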
X_train1 = X_train.drop("market_segment_type_Offline", axis=1)
vif_series2 = pd.Series(
[variance_inflation_factor(X_train1.values, i) for i in range(X_train1.shape[1])],
index=X_train1.columns,
)
print("Series before feature selection: \n\n{}\n".format(vif_series2))
Series before feature selection: 

const                                   3.938645e+07
no_of_adults                            1.335722e+00
no_of_children                          2.093156e+00
no_of_weekend_nights                    1.068500e+00
no_of_week_nights                       1.094772e+00
required_car_parking_space              1.039687e+00
lead_time                               1.387572e+00
arrival_year                            1.428273e+00
arrival_month                           1.275581e+00
arrival_date                            1.006764e+00
repeated_guest                          1.780592e+00
no_of_previous_cancellations            1.395539e+00
no_of_previous_bookings_not_canceled    1.651671e+00
avg_price_per_room                      2.068228e+00
no_of_special_requests                  1.247588e+00
type_of_meal_plan_Meal_Plan_2           1.272878e+00
type_of_meal_plan_Meal_Plan_3           1.025257e+00
type_of_meal_plan_Not_Selected          1.273000e+00
room_type_reserved_Room_Type_2          1.105943e+00
room_type_reserved_Room_Type_3          1.003302e+00
room_type_reserved_Room_Type_4          1.354465e+00
room_type_reserved_Room_Type_5          1.027966e+00
room_type_reserved_Room_Type_6          2.055870e+00
room_type_reserved_Room_Type_7          1.118118e+00
market_segment_type_Complementary       1.316269e+00
market_segment_type_Corporate           1.528613e+00
market_segment_type_Online              1.775374e+00
dtype: float64
After dropping market_segment_type_Offline, all predictors have VIF values below 5, so the multicollinearity has been treated.
logit2 = sm.Logit(y_train, X_train1.astype(float))
lg2 = logit2.fit()
print("Training performance:")
model_performance_classification_statsmodels(lg2, X_train1, y_train)
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.426213
Iterations: 35
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.806278 | 0.63494 | 0.739967 | 0.683442 |
print(lg2.summary())
Logit Regression Results
==============================================================================
Dep. Variable: booking_status No. Observations: 25392
Model: Logit Df Residuals: 25365
Method: MLE Df Model: 26
Date: Fri, 25 Feb 2022 Pseudo R-squ.: 0.3274
Time: 11:38:30 Log-Likelihood: -10822.
converged: False LL-Null: -16091.
Covariance Type: nonrobust LLR p-value: 0.000
========================================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------------------------
const -964.0915 120.422 -8.006 0.000 -1200.115 -728.068
no_of_adults 0.0887 0.037 2.369 0.018 0.015 0.162
no_of_children 0.1537 0.062 2.480 0.013 0.032 0.275
no_of_weekend_nights 0.1119 0.020 5.662 0.000 0.073 0.151
no_of_week_nights 0.0438 0.012 3.559 0.000 0.020 0.068
required_car_parking_space -1.5760 0.138 -11.440 0.000 -1.846 -1.306
lead_time 0.0155 0.000 58.692 0.000 0.015 0.016
arrival_year 0.4755 0.060 7.967 0.000 0.359 0.592
arrival_month -0.0402 0.006 -6.233 0.000 -0.053 -0.028
arrival_date 0.0006 0.002 0.292 0.771 -0.003 0.004
repeated_guest -2.2466 0.619 -3.627 0.000 -3.461 -1.033
no_of_previous_cancellations 0.2568 0.086 2.997 0.003 0.089 0.425
no_of_previous_bookings_not_canceled -0.1730 0.151 -1.146 0.252 -0.469 0.123
avg_price_per_room 0.0187 0.001 25.357 0.000 0.017 0.020
no_of_special_requests -1.4648 0.030 -48.784 0.000 -1.524 -1.406
type_of_meal_plan_Meal_Plan_2 0.1674 0.066 2.524 0.012 0.037 0.297
type_of_meal_plan_Meal_Plan_3 24.3434 1.32e+05 0.000 1.000 -2.58e+05 2.58e+05
type_of_meal_plan_Not_Selected 0.2802 0.053 5.291 0.000 0.176 0.384
room_type_reserved_Room_Type_2 -0.3552 0.131 -2.713 0.007 -0.612 -0.099
room_type_reserved_Room_Type_3 -0.0039 1.302 -0.003 0.998 -2.555 2.548
room_type_reserved_Room_Type_4 -0.2533 0.053 -4.766 0.000 -0.357 -0.149
room_type_reserved_Room_Type_5 -0.7224 0.208 -3.468 0.001 -1.131 -0.314
room_type_reserved_Room_Type_6 -0.9325 0.151 -6.170 0.000 -1.229 -0.636
room_type_reserved_Room_Type_7 -1.3818 0.293 -4.714 0.000 -1.956 -0.807
market_segment_type_Complementary -61.6070 1.81e+09 -3.41e-08 1.000 -3.54e+09 3.54e+09
market_segment_type_Corporate 0.9452 0.106 8.899 0.000 0.737 1.153
market_segment_type_Online 1.7474 0.051 33.957 0.000 1.647 1.848
========================================================================================================
# running a loop to drop variables with high p-value
# initial list of columns
cols = X_train1.columns.tolist()
# setting an initial max p-value
max_p_value = 1
while len(cols) > 0:
    # defining the train set
    X_train_aux = X_train1[cols]
    # fitting the model
    model = sm.Logit(y_train, X_train_aux).fit(disp=False)
    # getting the p-values and the maximum p-value
    p_values = model.pvalues
    max_p_value = max(p_values)
    # name of the variable with maximum p-value
    feature_with_p_max = p_values.idxmax()
    if max_p_value > 0.05:
        cols.remove(feature_with_p_max)
    else:
        break
selected_features = cols
print(selected_features)
['const', 'no_of_adults', 'no_of_children', 'no_of_weekend_nights', 'no_of_week_nights', 'required_car_parking_space', 'lead_time', 'arrival_year', 'arrival_month', 'repeated_guest', 'no_of_previous_cancellations', 'avg_price_per_room', 'no_of_special_requests', 'type_of_meal_plan_Meal_Plan_2', 'type_of_meal_plan_Not_Selected', 'room_type_reserved_Room_Type_2', 'room_type_reserved_Room_Type_4', 'room_type_reserved_Room_Type_5', 'room_type_reserved_Room_Type_6', 'room_type_reserved_Room_Type_7', 'market_segment_type_Corporate', 'market_segment_type_Online']
X_train3 = X_train1[selected_features]
logit5 = sm.Logit(y_train, X_train3.astype(float))
lg5 = logit5.fit(disp=False)
print(lg5.summary())
Logit Regression Results
==============================================================================
Dep. Variable: booking_status No. Observations: 25392
Model: Logit Df Residuals: 25370
Method: MLE Df Model: 21
Date: Fri, 25 Feb 2022 Pseudo R-squ.: 0.3270
Time: 11:38:36 Log-Likelihood: -10829.
converged: True LL-Null: -16091.
Covariance Type: nonrobust LLR p-value: 0.000
==================================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------------------
const -951.5524 120.311 -7.909 0.000 -1187.358 -715.747
no_of_adults 0.0903 0.037 2.413 0.016 0.017 0.164
no_of_children 0.1526 0.062 2.464 0.014 0.031 0.274
no_of_weekend_nights 0.1116 0.020 5.649 0.000 0.073 0.150
no_of_week_nights 0.0442 0.012 3.598 0.000 0.020 0.068
required_car_parking_space -1.5778 0.138 -11.452 0.000 -1.848 -1.308
lead_time 0.0155 0.000 58.838 0.000 0.015 0.016
arrival_year 0.4693 0.060 7.870 0.000 0.352 0.586
arrival_month -0.0408 0.006 -6.337 0.000 -0.053 -0.028
repeated_guest -2.6667 0.560 -4.764 0.000 -3.764 -1.570
no_of_previous_cancellations 0.2214 0.077 2.878 0.004 0.071 0.372
avg_price_per_room 0.0189 0.001 25.665 0.000 0.017 0.020
no_of_special_requests -1.4655 0.030 -48.831 0.000 -1.524 -1.407
type_of_meal_plan_Meal_Plan_2 0.1613 0.066 2.433 0.015 0.031 0.291
type_of_meal_plan_Not_Selected 0.2823 0.053 5.331 0.000 0.179 0.386
room_type_reserved_Room_Type_2 -0.3528 0.131 -2.696 0.007 -0.609 -0.096
room_type_reserved_Room_Type_4 -0.2573 0.053 -4.843 0.000 -0.361 -0.153
room_type_reserved_Room_Type_5 -0.7273 0.208 -3.495 0.000 -1.135 -0.319
room_type_reserved_Room_Type_6 -0.9427 0.151 -6.238 0.000 -1.239 -0.647
room_type_reserved_Room_Type_7 -1.3978 0.293 -4.768 0.000 -1.972 -0.823
market_segment_type_Corporate 0.9437 0.106 8.893 0.000 0.736 1.152
market_segment_type_Online 1.7479 0.051 33.989 0.000 1.647 1.849
==================================================================================================
Now no feature has a p-value greater than 0.05, so we'll consider the features in X_train3 as the final ones and lg5 as the final model.
# converting coefficients to odds
odds = np.exp(lg5.params)
# finding the percentage change
perc_change_odds = (np.exp(lg5.params) - 1) * 100
# removing limit from number of columns to display
pd.set_option("display.max_columns", None)
# adding the odds to a dataframe
pd.DataFrame({"Odds": odds, "Change_odd%": perc_change_odds}, index=X_train3.columns).T
| | const | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | required_car_parking_space | lead_time | arrival_year | arrival_month | repeated_guest | no_of_previous_cancellations | avg_price_per_room | no_of_special_requests | type_of_meal_plan_Meal_Plan_2 | type_of_meal_plan_Not_Selected | room_type_reserved_Room_Type_2 | room_type_reserved_Room_Type_4 | room_type_reserved_Room_Type_5 | room_type_reserved_Room_Type_6 | room_type_reserved_Room_Type_7 | market_segment_type_Corporate | market_segment_type_Online |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Odds | 0.0 | 1.094535 | 1.164893 | 1.118099 | 1.045241 | 0.206425 | 1.015632 | 1.598837 | 0.959984 | 0.069479 | 1.247859 | 1.019052 | 0.230973 | 1.175092 | 1.326189 | 0.702708 | 0.773140 | 0.483221 | 0.389587 | 0.247151 | 2.569518 | 5.742355 |
| Change_odd% | -100.0 | 9.453510 | 16.489306 | 11.809905 | 4.524125 | -79.357516 | 1.563191 | 59.883746 | -4.001602 | -93.052054 | 24.785905 | 1.905220 | -76.902741 | 17.509219 | 32.618921 | -29.729225 | -22.685992 | -51.677866 | -61.041292 | -75.284939 | 156.951849 | 474.235513 |
no_of_adults: Holding all other features constant, a 1 unit increase in no_of_adults multiplies the odds of a customer canceling their booking by 1.094535, i.e. a 9.45% increase in the odds of cancellation.
arrival_month: Holding all other features constant, a 1 unit increase in arrival_month multiplies the odds of a customer canceling their booking by 0.959984, i.e. a 4% decrease in the odds of cancellation.
avg_price_per_room: Holding all other features constant, a 1 unit increase in avg_price_per_room multiplies the odds of a customer canceling their booking by 1.019052, i.e. a 1.9% increase in the odds of cancellation.
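The arithmetic behind these interpretations can be checked by hand. The sketch below uses only the standard library and the coefficients as printed (rounded to four decimals) in the lg5 summary, so the results differ from the table in the last digits. The helper names `odds_ratio` and `percent_change_in_odds` are made up for this example.

```python
from math import exp

# coefficients copied from the lg5 summary (rounded to 4 decimals)
coefficients = {
    "no_of_adults": 0.0903,
    "arrival_month": -0.0408,
    "avg_price_per_room": 0.0189,
}


def odds_ratio(coef):
    """Multiplicative change in the odds for a 1 unit increase in the feature."""
    return exp(coef)


def percent_change_in_odds(coef):
    """The same change expressed as a percentage."""
    return (exp(coef) - 1) * 100


for name, coef in coefficients.items():
    print(
        f"{name}: odds ratio = {odds_ratio(coef):.4f}, "
        f"change in odds = {percent_change_in_odds(coef):+.2f}%"
    )
```

Because odds ratios multiply, a larger change in a feature is obtained by scaling the coefficient first: a 10 unit increase in avg_price_per_room corresponds to `exp(10 * 0.0189)`, not to ten times the one-unit percentage.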
Interpretation for other attributes can be done similarly.
# creating confusion matrix
confusion_matrix_statsmodels(lg5, X_train3, y_train)
log_reg_model_train_perf = model_performance_classification_statsmodels(
lg5, X_train3, y_train
)
print("Training performance:")
log_reg_model_train_perf
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.806159 | 0.634461 | 0.739925 | 0.683147 |
logit_roc_auc_train = roc_auc_score(y_train, lg5.predict(X_train3))
fpr, tpr, thresholds = roc_curve(y_train, lg5.predict(X_train3))
plt.figure(figsize=(7, 5))
plt.plot(fpr, tpr, label="Logistic Regression (area = %0.2f)" % logit_roc_auc_train)
plt.plot([0, 1], [0, 1], "r--")
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("Receiver operating characteristic")
plt.legend(loc="lower right")
plt.show()
# Optimal threshold as per AUC-ROC curve
# The optimal cut off would be where tpr is high and fpr is low
fpr, tpr, thresholds = roc_curve(y_train, lg5.predict(X_train3))
optimal_idx = np.argmax(tpr - fpr)
optimal_threshold_auc_roc = thresholds[optimal_idx]
print(optimal_threshold_auc_roc)
0.3709246479861554
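The rule `np.argmax(tpr - fpr)` picks the threshold that maximizes the gap between the true positive rate and the false positive rate (often called Youden's J statistic). The pure-Python sketch below reproduces that rule on a tiny made-up example so the mechanics are visible. The labels, scores and the function name `youden_threshold` are assumptions for illustration; like `roc_curve`, it treats a score at or above the threshold as a positive prediction.

```python
def youden_threshold(y_true, scores):
    """Return the threshold that maximises TPR - FPR, and that maximum."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best_threshold, best_j = None, float("-inf")
    # every distinct score is a candidate threshold
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


# made-up labels (1 = canceled) and predicted probabilities
y_true = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.45, 0.60, 0.70, 0.80, 0.90]

best_threshold, best_j = youden_threshold(y_true, scores)
print(best_threshold, best_j)
```

This criterion weighs both error rates equally; if missed cancellations and false alarms have different costs for the hotel, a different threshold may be preferable.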
# creating confusion matrix
confusion_matrix_statsmodels(
lg5, X_train3, y_train, threshold=optimal_threshold_auc_roc
)
# checking model performance for this model
log_reg_model_train_perf_threshold_auc_roc = model_performance_classification_statsmodels(
lg5, X_train3, y_train, threshold=optimal_threshold_auc_roc
)
print("Training performance:")
log_reg_model_train_perf_threshold_auc_roc
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.791509 | 0.735741 | 0.666125 | 0.699205 |
y_scores = lg5.predict(X_train3)
prec, rec, tre = precision_recall_curve(y_train, y_scores,)
def plot_prec_recall_vs_tresh(precisions, recalls, thresholds):
    plt.plot(thresholds, precisions[:-1], "b--", label="precision")
    plt.plot(thresholds, recalls[:-1], "g--", label="recall")
    plt.xlabel("Threshold")
    plt.legend(loc="upper left")
    plt.ylim([0, 1])
plt.figure(figsize=(10, 7))
plot_prec_recall_vs_tresh(prec, rec, tre)
plt.show()
# setting the threshold
optimal_threshold_curve = 0.425
# creating confusion matrix
confusion_matrix_statsmodels(lg5, X_train3, y_train, threshold=optimal_threshold_curve)
log_reg_model_train_perf_threshold_curve = model_performance_classification_statsmodels(
lg5, X_train3, y_train, threshold=optimal_threshold_curve
)
print("Training performance:")
log_reg_model_train_perf_threshold_curve
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.801867 | 0.694846 | 0.700965 | 0.697892 |
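The threshold of 0.425 was set by hand from the precision-recall plot, at roughly the point where the two curves cross. As a sketch of how such a point could be located programmatically, the pure-Python example below scans a grid of candidate thresholds and returns the one where precision and recall are closest. The data, the grid and the function names are made up for illustration; predictions use `score > threshold`, as in the helper functions of this notebook.

```python
def precision_recall_at(y_true, scores, threshold):
    """Precision and recall when scores above the threshold are predicted as class 1."""
    tp = sum(1 for y, s in zip(y_true, scores) if s > threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s > threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s <= threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention when nothing is predicted positive
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def balanced_threshold(y_true, scores, candidates):
    """Return the candidate threshold where precision and recall are closest."""

    def gap(threshold):
        precision, recall = precision_recall_at(y_true, scores, threshold)
        return abs(precision - recall)

    return min(candidates, key=gap)


# made-up labels (1 = canceled) and predicted probabilities
y_true = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.45, 0.60, 0.70, 0.80, 0.90]
candidates = [i / 100 for i in range(5, 96, 5)]  # 0.05, 0.10, ..., 0.95

crossing = balanced_threshold(y_true, scores, candidates)
print(crossing, precision_recall_at(y_true, scores, crossing))
```

On real data the same scan could be run over the thresholds returned by `precision_recall_curve` instead of a fixed grid.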
# training performance comparison
models_train_comp_df = pd.concat(
[
log_reg_model_train_perf.T,
log_reg_model_train_perf_threshold_auc_roc.T,
log_reg_model_train_perf_threshold_curve.T,
],
axis=1,
)
models_train_comp_df.columns = [
    "Logistic Regression-default Threshold (0.5)",
    "Logistic Regression-0.37 Threshold",
    "Logistic Regression-0.425 Threshold",
]
print("Training performance comparison:")
models_train_comp_df
Training performance comparison:
| | Logistic Regression-default Threshold (0.5) | Logistic Regression-0.37 Threshold | Logistic Regression-0.425 Threshold |
|---|---|---|---|
| Accuracy | 0.806159 | 0.791509 | 0.801867 |
| Recall | 0.634461 | 0.735741 | 0.694846 |
| Precision | 0.739925 | 0.666125 | 0.700965 |
| F1 | 0.683147 | 0.699205 | 0.697892 |
X_test5 = X_test[list(X_train3.columns)]
# creating confusion matrix
confusion_matrix_statsmodels(lg5, X_test5, y_test)
log_reg_model_test_perf = model_performance_classification_statsmodels(
lg5, X_test5, y_test
)
print("Test performance:")
log_reg_model_test_perf
Test performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.80566 | 0.633163 | 0.730429 | 0.678327 |
logit_roc_auc_test = roc_auc_score(y_test, lg5.predict(X_test5))
fpr, tpr, thresholds = roc_curve(y_test, lg5.predict(X_test5))
plt.figure(figsize=(7, 5))
plt.plot(fpr, tpr, label="Logistic Regression (area = %0.2f)" % logit_roc_auc_test)
plt.plot([0, 1], [0, 1], "r--")
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("Receiver operating characteristic")
plt.legend(loc="lower right")
plt.show()
confusion_matrix_statsmodels(lg5, X_test5, y_test, threshold=optimal_threshold_auc_roc)
# checking model performance for this model
log_reg_model_test_perf_threshold_auc_roc = model_performance_classification_statsmodels(
lg5, X_test5, y_test, threshold=optimal_threshold_auc_roc
)
print("Test performance:")
log_reg_model_test_perf_threshold_auc_roc
Test performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.794909 | 0.738501 | 0.664877 | 0.699758 |
# creating confusion matrix
confusion_matrix_statsmodels(lg5, X_test5, y_test, threshold=optimal_threshold_curve)
log_reg_model_test_perf_threshold_curve = model_performance_classification_statsmodels(
lg5, X_test5, y_test, threshold=optimal_threshold_curve
)
print("Test performance:")
log_reg_model_test_perf_threshold_curve
Test performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.804833 | 0.701022 | 0.697458 | 0.699235 |
# training performance comparison
models_train_comp_df = pd.concat(
[
log_reg_model_train_perf.T,
log_reg_model_train_perf_threshold_auc_roc.T,
log_reg_model_train_perf_threshold_curve.T,
],
axis=1,
)
models_train_comp_df.columns = [
    "Logistic Regression-default Threshold (0.5)",
    "Logistic Regression-0.37 Threshold",
    "Logistic Regression-0.425 Threshold",
]
print("Training performance comparison:")
models_train_comp_df
Training performance comparison:
| | Logistic Regression-default Threshold (0.5) | Logistic Regression-0.37 Threshold | Logistic Regression-0.425 Threshold |
|---|---|---|---|
| Accuracy | 0.806159 | 0.791509 | 0.801867 |
| Recall | 0.634461 | 0.735741 | 0.694846 |
| Precision | 0.739925 | 0.666125 | 0.700965 |
| F1 | 0.683147 | 0.699205 | 0.697892 |
# testing performance comparison
models_test_comp_df = pd.concat(
[
log_reg_model_test_perf.T,
log_reg_model_test_perf_threshold_auc_roc.T,
log_reg_model_test_perf_threshold_curve.T,
],
axis=1,
)
models_test_comp_df.columns = [
    "Logistic Regression-default Threshold (0.5)",
    "Logistic Regression-0.37 Threshold",
    "Logistic Regression-0.425 Threshold",
]
print("Test set performance comparison:")
models_test_comp_df
Test set performance comparison:
| | Logistic Regression-default Threshold (0.5) | Logistic Regression-0.37 Threshold | Logistic Regression-0.425 Threshold |
|---|---|---|---|
| Accuracy | 0.805660 | 0.794909 | 0.804833 |
| Recall | 0.633163 | 0.738501 | 0.701022 |
| Precision | 0.730429 | 0.664877 | 0.697458 |
| F1 | 0.678327 | 0.699758 | 0.699235 |
We have been able to build a predictive model that the hotel can use to identify bookings that are likely to be canceled, with an F1 score of about 0.70 on both the training and test sets at the tuned thresholds, and to formulate cancellation policies accordingly.
All the logistic regression models perform similarly on the training and test sets, so they generalize well.
The coefficients of no_of_adults, no_of_children, lead_time and avg_price_per_room are positive, so an increase in any of them increases the chance that a customer cancels.
For features with negative coefficients, such as no_of_special_requests, required_car_parking_space and repeated_guest, an increase decreases the chance that a customer cancels.
X = data.drop(["booking_status"], axis=1)
Y = data["booking_status"]
X = pd.get_dummies(X, drop_first=True)
# Splitting data in train and test sets
X_train, X_test, y_train, y_test = train_test_split(
X, Y, test_size=0.30, random_state=1
)
print("Number of rows in train data =", X_train.shape[0])
print("Number of rows in test data =", X_test.shape[0])
Number of rows in train data = 25392
Number of rows in test data = 10883
print("Percentage of classes in training set:")
print(y_train.value_counts(normalize=True))
print("Percentage of classes in test set:")
print(y_test.value_counts(normalize=True))
Percentage of classes in training set:
0    0.670644
1    0.329356
Name: booking_status, dtype: float64
Percentage of classes in test set:
0    0.676376
1    0.323624
Name: booking_status, dtype: float64
# defining a function to compute different metrics to check performance of a classification model built using sklearn
def model_performance_classification_sklearn(model, predictors, target):
    """
    Function to compute different metrics to check classification model performance
    model: classifier
    predictors: independent variables
    target: dependent variable
    """
    # predicting using the independent variables
    pred = model.predict(predictors)
    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred)  # to compute Recall
    precision = precision_score(target, pred)  # to compute Precision
    f1 = f1_score(target, pred)  # to compute F1-score
    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {"Accuracy": acc, "Recall": recall, "Precision": precision, "F1": f1,},
        index=[0],
    )
    return df_perf
def confusion_matrix_sklearn(model, predictors, target):
    """
    To plot the confusion_matrix with percentages
    model: classifier
    predictors: independent variables
    target: dependent variable
    """
    y_pred = model.predict(predictors)
    cm = confusion_matrix(target, y_pred)
    labels = np.asarray(
        [
            ["{0:0.0f}".format(item) + "\n{0:.2%}".format(item / cm.flatten().sum())]
            for item in cm.flatten()
        ]
    ).reshape(2, 2)
    plt.figure(figsize=(6, 4))
    sns.heatmap(cm, annot=labels, fmt="")
    plt.ylabel("True label")
    plt.xlabel("Predicted label")
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree
# To tune different models
from sklearn.model_selection import GridSearchCV
# To get different metric scores
# (plot_confusion_matrix is not imported: it is unused here and was removed in scikit-learn 1.2)
from sklearn.metrics import (
    f1_score,
    accuracy_score,
    recall_score,
    precision_score,
    confusion_matrix,
    make_scorer,
)
model = DecisionTreeClassifier(
criterion="gini", class_weight={0: 0.15, 1: 0.85}, random_state=1
)
model.fit(X_train, y_train)
DecisionTreeClassifier(class_weight={0: 0.15, 1: 0.85}, random_state=1)
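The `class_weight={0: 0.15, 1: 0.85}` argument makes each canceled booking count more than each non-canceled booking when the tree evaluates a split. The sketch below shows the effect on the Gini impurity of a single node, using plain Python. The node counts and the helper names `weighted_counts` and `gini` are made up for illustration, and the description of how weights enter the impurity reflects my understanding of scikit-learn rather than a quote from its documentation.

```python
CLASS_WEIGHT = {0: 0.15, 1: 0.85}  # same weights as passed to DecisionTreeClassifier


def weighted_counts(n_class0, n_class1, class_weight=CLASS_WEIGHT):
    """Scale raw sample counts by their class weights."""
    return [n_class0 * class_weight[0], n_class1 * class_weight[1]]


def gini(counts):
    """Gini impurity of a node given (possibly weighted) class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# hypothetical node with 85 non-canceled and 15 canceled bookings
raw = [85, 15]
weighted = weighted_counts(*raw)

gini_raw = gini(raw)
gini_weighted = gini(weighted)
print(gini_raw, gini_weighted)
```

With these weights, a node that looks fairly pure in raw counts is treated as perfectly mixed, so the tree keeps splitting to isolate the cancellations. The same scaling helps when reading the leaf `weights` printed by `export_text(..., show_weights=True)`: a leaf reported as `[267.90, 0.00]` would correspond to 267.90 / 0.15 = 1786 class-0 samples under this assumption.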
confusion_matrix_sklearn(model, X_train, y_train)
decision_tree_perf_train = model_performance_classification_sklearn(
model, X_train, y_train
)
decision_tree_perf_train
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.990312 | 0.998326 | 0.972964 | 0.985482 |
confusion_matrix_sklearn(model, X_test, y_test)
decision_tree_perf_test = model_performance_classification_sklearn(
    model, X_test, y_test
)
decision_tree_perf_test
# print("Test performance:", decision_tree_perf_test)
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.867867 | 0.803521 | 0.791387 | 0.797408 |
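Comparing the two result tables directly shows how much performance drops from training to test data. The sketch below takes the numbers reported above and computes the per-metric gap. Reading a large gap as a sign of overfitting is a common interpretation for a fully grown decision tree, offered here as an assumption rather than a conclusion stated in the notebook output.

```python
# Metrics copied from the training and test result tables above
train = {"Accuracy": 0.990312, "Recall": 0.998326, "Precision": 0.972964, "F1": 0.985482}
test = {"Accuracy": 0.867867, "Recall": 0.803521, "Precision": 0.791387, "F1": 0.797408}

# Positive gap = the model scores higher on the data it was trained on
gap = {metric: round(train[metric] - test[metric], 6) for metric in train}

for metric, value in gap.items():
    print(f"{metric}: train - test = {value:.6f}")
```

Every metric falls by more than 0.12 on the test set, with recall dropping the most.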
feature_names = X_train.columns.to_list()
plt.figure(figsize=(20, 30))
out = tree.plot_tree(
    model,
    feature_names=feature_names,
    filled=True,
    fontsize=9,
    node_ids=False,
    class_names=None,
)
# below code will add arrows to the decision tree split if they are missing
for o in out:
    arrow = o.arrow_patch
    if arrow is not None:
        arrow.set_edgecolor("black")
        arrow.set_linewidth(1)
plt.show()
print(tree.export_text(model, feature_names=feature_names, show_weights=True))
|--- lead_time <= 90.50 | |--- no_of_special_requests <= 1.50 | | |--- market_segment_type_Online <= 0.50 | | | |--- no_of_weekend_nights <= 0.50 | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | |--- avg_price_per_room <= 178.44 | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | |--- avg_price_per_room <= 92.50 | | | | | | | | |--- weights: [25.65, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 92.50 | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | | |--- lead_time <= 3.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- lead_time > 3.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | |--- no_of_previous_bookings_not_canceled <= 0.50 | | | | | | | | | | | |--- weights: [0.00, 4.25] class: 1 | | | | | | | | | | |--- no_of_previous_bookings_not_canceled > 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | |--- lead_time <= 6.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 6.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | |--- weights: [267.90, 0.00] class: 0 | | | | | |--- avg_price_per_room > 178.44 | | | | | | |--- arrival_date <= 25.50 | | | | | | | |--- lead_time <= 18.00 | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- lead_time > 18.00 | | | | | | | | |--- weights: [0.00, 13.60] class: 1 | | | | | | |--- arrival_date > 25.50 | | | | | | | |--- 
avg_price_per_room <= 184.00 | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 184.00 | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | |--- market_segment_type_Corporate > 0.50 | | | | | |--- repeated_guest <= 0.50 | | | | | | |--- no_of_adults <= 1.50 | | | | | | | |--- lead_time <= 19.50 | | | | | | | | |--- lead_time <= 5.50 | | | | | | | | | |--- lead_time <= 3.50 | | | | | | | | | | |--- avg_price_per_room <= 161.09 | | | | | | | | | | | |--- truncated branch of depth 13 | | | | | | | | | | |--- avg_price_per_room > 161.09 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- lead_time > 3.50 | | | | | | | | | | |--- avg_price_per_room <= 82.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- avg_price_per_room > 82.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | |--- lead_time > 5.50 | | | | | | | | | |--- avg_price_per_room <= 70.00 | | | | | | | | | | |--- avg_price_per_room <= 66.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- avg_price_per_room > 66.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- avg_price_per_room > 70.00 | | | | | | | | | | |--- arrival_date <= 29.00 | | | | | | | | | | | |--- weights: [16.95, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 29.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- lead_time > 19.50 | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | |--- lead_time <= 48.50 | | | | | | | | | | |--- no_of_special_requests <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | | | |--- no_of_special_requests > 0.50 | | | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | | | | | |--- lead_time > 48.50 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- no_of_week_nights > 
1.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | |--- weights: [2.70, 0.00] class: 0 | | | | | | |--- no_of_adults > 1.50 | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | |--- weights: [5.85, 0.00] class: 0 | | | | | | | |--- arrival_date > 7.50 | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | |--- arrival_date <= 13.50 | | | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- arrival_date > 13.50 | | | | | | | | | | |--- arrival_date <= 18.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- arrival_date > 18.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | |--- avg_price_per_room <= 182.50 | | | | | | | | | | |--- arrival_date <= 24.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 24.50 | | | | | | | | | | | |--- weights: [3.75, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 182.50 | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | |--- repeated_guest > 0.50 | | | | | | |--- weights: [38.10, 0.00] class: 0 | | | |--- no_of_weekend_nights > 0.50 | | | | |--- lead_time <= 68.50 | | | | | |--- arrival_month <= 9.50 | | | | | | |--- no_of_special_requests <= 0.50 | | | | | | | |--- avg_price_per_room <= 63.29 | | | | | | | | |--- arrival_date <= 20.50 | | | | | | | | | |--- type_of_meal_plan_Not_Selected <= 0.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- weights: [1.95, 0.00] class: 0 | | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | | |--- weights: [6.45, 0.00] class: 0 | | | | | | | | | |--- type_of_meal_plan_Not_Selected > 0.50 | | | | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | | 
| | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | | |--- arrival_date > 7.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- arrival_date > 20.50 | | | | | | | | | |--- no_of_week_nights <= 0.50 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- no_of_week_nights > 0.50 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | |--- avg_price_per_room > 63.29 | | | | | | | | |--- no_of_weekend_nights <= 3.50 | | | | | | | | | |--- arrival_month <= 2.50 | | | | | | | | | | |--- avg_price_per_room <= 73.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- avg_price_per_room > 73.00 | | | | | | | | | | | |--- weights: [9.60, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 2.50 | | | | | | | | | | |--- lead_time <= 59.50 | | | | | | | | | | | |--- truncated branch of depth 17 | | | | | | | | | | |--- lead_time > 59.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | |--- no_of_weekend_nights > 3.50 | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | |--- weights: [0.00, 8.50] class: 1 | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | |--- no_of_special_requests > 0.50 | | | | | | | |--- avg_price_per_room <= 127.00 | | | | | | | | |--- weights: [35.10, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 127.00 | | | | | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | | | | | |--- type_of_meal_plan_Meal_Plan_2 <= 0.50 | | | | | | | | | | |--- weights: [1.80, 0.00] class: 0 | | | | | | | | | |--- type_of_meal_plan_Meal_Plan_2 > 0.50 | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- market_segment_type_Corporate > 0.50 | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | 
| | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | |--- arrival_month > 9.50 | | | | | | |--- no_of_week_nights <= 11.50 | | | | | | | |--- lead_time <= 65.50 | | | | | | | | |--- avg_price_per_room <= 66.75 | | | | | | | | | |--- lead_time <= 3.50 | | | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- lead_time > 3.50 | | | | | | | | | | |--- no_of_week_nights <= 0.50 | | | | | | | | | | | |--- weights: [3.75, 0.00] class: 0 | | | | | | | | | | |--- no_of_week_nights > 0.50 | | | | | | | | | | | |--- weights: [24.15, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 66.75 | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | |--- arrival_date <= 6.50 | | | | | | | | | | | |--- weights: [4.35, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 6.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | |--- lead_time <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- lead_time > 0.50 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | |--- lead_time > 65.50 | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | |--- room_type_reserved_Room_Type_3 <= 0.50 | | | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | | | |--- weights: [0.75, 1.70] class: 1 | | | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | |--- room_type_reserved_Room_Type_3 > 0.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | |--- arrival_date <= 1.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | 
|--- arrival_date > 1.50 | | | | | | | | | | |--- weights: [1.95, 0.00] class: 0 | | | | | | |--- no_of_week_nights > 11.50 | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | |--- lead_time > 68.50 | | | | | |--- avg_price_per_room <= 99.98 | | | | | | |--- arrival_month <= 3.50 | | | | | | | |--- avg_price_per_room <= 62.50 | | | | | | | | |--- weights: [3.60, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 62.50 | | | | | | | | |--- arrival_date <= 25.50 | | | | | | | | | |--- arrival_date <= 14.50 | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 14.50 | | | | | | | | | | |--- avg_price_per_room <= 87.00 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- avg_price_per_room > 87.00 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- arrival_date > 25.50 | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | | |--- arrival_month > 3.50 | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | |--- arrival_date <= 25.50 | | | | | | | | | |--- lead_time <= 88.50 | | | | | | | | | | |--- weights: [12.60, 0.00] class: 0 | | | | | | | | | |--- lead_time > 88.50 | | | | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | | | | |--- weights: [1.05, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 8.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- arrival_date > 25.50 | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | |--- avg_price_per_room <= 80.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | | |--- avg_price_per_room > 80.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | |--- no_of_special_requests <= 0.50 | | | | | | | | | |--- lead_time <= 73.50 | | | | | | | | | | |--- weights: 
[0.00, 2.55] class: 1 | | | | | | | | | |--- lead_time > 73.50 | | | | | | | | | | |--- lead_time <= 81.50 | | | | | | | | | | | |--- weights: [2.70, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 81.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | |--- no_of_special_requests > 0.50 | | | | | | | | | |--- weights: [4.20, 0.00] class: 0 | | | | | |--- avg_price_per_room > 99.98 | | | | | | |--- arrival_year <= 2017.50 | | | | | | | |--- weights: [1.80, 0.00] class: 0 | | | | | | |--- arrival_year > 2017.50 | | | | | | | |--- no_of_special_requests <= 0.50 | | | | | | | | |--- avg_price_per_room <= 132.43 | | | | | | | | | |--- arrival_month <= 2.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 2.50 | | | | | | | | | | |--- no_of_children <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | | | |--- no_of_children > 0.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 132.43 | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | |--- no_of_special_requests > 0.50 | | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- arrival_month > 3.50 | | | | | | | | | |--- weights: [1.20, 0.00] class: 0 | | |--- market_segment_type_Online > 0.50 | | | |--- no_of_special_requests <= 0.50 | | | | |--- lead_time <= 3.50 | | | | | |--- avg_price_per_room <= 202.67 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | |--- weights: [10.05, 0.00] class: 0 | | | | | | | |--- arrival_month > 1.50 | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | |--- avg_price_per_room <= 74.57 | | | | | | | | | | |--- weights: [5.55, 0.00] class: 0 | | | | | 
| | | | |--- avg_price_per_room > 74.57 | | | | | | | | | | |--- avg_price_per_room <= 164.33 | | | | | | | | | | | |--- truncated branch of depth 19 | | | | | | | | | | |--- avg_price_per_room > 164.33 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | |--- lead_time <= 0.50 | | | | | | | | | | |--- weights: [2.25, 0.00] class: 0 | | | | | | | | | |--- lead_time > 0.50 | | | | | | | | | | |--- arrival_date <= 26.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- arrival_date > 26.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- avg_price_per_room <= 169.67 | | | | | | | | |--- avg_price_per_room <= 94.66 | | | | | | | | | |--- weights: [11.70, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 94.66 | | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | | |--- arrival_date <= 28.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_date > 28.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- avg_price_per_room > 169.67 | | | | | | | | |--- avg_price_per_room <= 182.25 | | | | | | | | | |--- arrival_date <= 11.00 | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 11.00 | | | | | | | | | | |--- arrival_date <= 24.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- arrival_date > 24.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 182.25 | | | | | | | | | |--- weights: [1.05, 0.00] class: 0 | | | | | |--- avg_price_per_room > 202.67 | | | | | | |--- 
arrival_month <= 11.00 | | | | | | | |--- weights: [0.00, 12.75] class: 1 | | | | | | |--- arrival_month > 11.00 | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | |--- lead_time > 3.50 | | | | | |--- arrival_year <= 2017.50 | | | | | | |--- lead_time <= 62.50 | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | | |--- weights: [3.30, 0.00] class: 0 | | | | | | | | |--- arrival_date > 7.50 | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | |--- no_of_week_nights <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- no_of_week_nights > 0.50 | | | | | | | | | | | |--- weights: [3.30, 0.00] class: 0 | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | |--- avg_price_per_room <= 196.89 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- avg_price_per_room > 196.89 | | | | | | | | | | | |--- weights: [0.00, 3.40] class: 1 | | | | | | | |--- arrival_month > 9.50 | | | | | | | | |--- avg_price_per_room <= 129.92 | | | | | | | | | |--- lead_time <= 26.50 | | | | | | | | | | |--- weights: [19.95, 0.00] class: 0 | | | | | | | | | |--- lead_time > 26.50 | | | | | | | | | | |--- arrival_date <= 6.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_date > 6.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | |--- avg_price_per_room > 129.92 | | | | | | | | | |--- avg_price_per_room <= 145.50 | | | | | | | | | | |--- lead_time <= 9.50 | | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 9.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- avg_price_per_room > 145.50 | | | | | | | | | | |--- weights: [1.35, 0.00] class: 0 | | | | | | |--- lead_time > 62.50 | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | |--- avg_price_per_room <= 47.63 | | | | | | | | | |--- weights: [0.60, 0.00] class: 0 | | | 
| | | | | |--- avg_price_per_room > 47.63 | | | | | | | | | |--- no_of_children <= 0.50 | | | | | | | | | | |--- avg_price_per_room <= 78.62 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- avg_price_per_room > 78.62 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | |--- no_of_children > 0.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- arrival_date > 27.50 | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | |--- arrival_year > 2017.50 | | | | | | |--- arrival_month <= 1.50 | | | | | | | |--- lead_time <= 24.50 | | | | | | | | |--- weights: [14.40, 0.00] class: 0 | | | | | | | |--- lead_time > 24.50 | | | | | | | | |--- avg_price_per_room <= 57.94 | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 57.94 | | | | | | | | | |--- arrival_date <= 9.50 | | | | | | | | | | |--- avg_price_per_room <= 142.12 | | | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 142.12 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | |--- arrival_date > 9.50 | | | | | | | | | | |--- arrival_date <= 30.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- arrival_date > 30.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | |--- arrival_month > 1.50 | | | | | | | |--- avg_price_per_room <= 61.67 | | | | | | | | |--- avg_price_per_room <= 47.20 | | | | | | | | | |--- weights: [3.15, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 47.20 | | | | | | | | | |--- lead_time <= 51.50 | | | | | | | | | | |--- avg_price_per_room <= 57.43 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- avg_price_per_room > 57.43 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- lead_time > 51.50 | | | | | | | | | | |--- avg_price_per_room <= 49.01 | | | | | | | | | | | |--- weights: [0.30, 
0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 49.01 | | | | | | | | | | | |--- weights: [1.80, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 61.67 | | | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- lead_time <= 9.50 | | | | | | | | | | | |--- truncated branch of depth 17 | | | | | | | | | | |--- lead_time > 9.50 | | | | | | | | | | | |--- truncated branch of depth 34 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- lead_time <= 24.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- lead_time > 24.50 | | | | | | | | | | | |--- truncated branch of depth 12 | | | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | | | |--- avg_price_per_room <= 204.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | | |--- weights: [5.55, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 204.50 | | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | |--- no_of_special_requests > 0.50 | | | | |--- arrival_month <= 11.50 | | | | | |--- lead_time <= 6.50 | | | | | | |--- avg_price_per_room <= 157.64 | | | | | | | |--- no_of_week_nights <= 10.00 | | | | | | | | |--- lead_time <= 4.50 | | | | | | | | | |--- room_type_reserved_Room_Type_2 <= 0.50 | | | | | | | | | | |--- arrival_date <= 4.50 | | | | | | | | | | | |--- weights: [11.70, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 4.50 | | | | | | | | | | | |--- truncated branch of depth 14 | | | | | | | | | |--- room_type_reserved_Room_Type_2 > 0.50 | | | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- lead_time > 4.50 | | | | | | | | | |--- arrival_date <= 
13.50 | | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | | |--- truncated branch of depth 12 | | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 13.50 | | | | | | | | | | |--- avg_price_per_room <= 138.79 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- avg_price_per_room > 138.79 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | |--- no_of_week_nights > 10.00 | | | | | | | | |--- lead_time <= 4.50 | | | | | | | | | |--- arrival_date <= 21.50 | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | |--- arrival_date > 21.50 | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- lead_time > 4.50 | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 157.64 | | | | | | | |--- arrival_date <= 18.50 | | | | | | | | |--- arrival_date <= 10.00 | | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | | |--- weights: [3.45, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | | |--- lead_time <= 4.00 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 4.00 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- arrival_date > 10.00 | | | | | | | | | |--- arrival_date <= 16.50 | | | | | | | | | | |--- avg_price_per_room <= 186.67 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- avg_price_per_room > 186.67 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- arrival_date > 16.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- arrival_date > 18.50 | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | |--- lead_time <= 
1.50 | | | | | | | | | | |--- arrival_date <= 20.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 20.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- lead_time > 1.50 | | | | | | | | | | |--- no_of_adults <= 2.50 | | | | | | | | | | | |--- weights: [1.20, 0.00] class: 0 | | | | | | | | | | |--- no_of_adults > 2.50 | | | | | | | | | | | |--- weights: [0.60, 0.00] class: 0 | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | |--- weights: [5.25, 0.00] class: 0 | | | | | |--- lead_time > 6.50 | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | |--- weights: [15.60, 0.00] class: 0 | | | | | | | |--- arrival_month > 1.50 | | | | | | | | |--- avg_price_per_room <= 121.78 | | | | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | | | | |--- avg_price_per_room <= 67.36 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- avg_price_per_room > 67.36 | | | | | | | | | | | |--- truncated branch of depth 30 | | | | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | | | | |--- no_of_weekend_nights <= 2.50 | | | | | | | | | | | |--- truncated branch of depth 18 | | | | | | | | | | |--- no_of_weekend_nights > 2.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | |--- avg_price_per_room > 121.78 | | | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | | | |--- truncated branch of depth 24 | | | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | | | |--- truncated branch of depth 19 | | | | | | | | | |--- arrival_month > 8.50 | | | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | | | |--- truncated branch of depth 23 | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | |--- weights: [19.65, 0.00] 
class: 0 | | | | |--- arrival_month > 11.50 | | | | | |--- no_of_weekend_nights <= 4.50 | | | | | | |--- weights: [59.40, 0.00] class: 0 | | | | | |--- no_of_weekend_nights > 4.50 | | | | | | |--- weights: [0.00, 0.85] class: 1 | |--- no_of_special_requests > 1.50 | | |--- no_of_week_nights <= 3.50 | | | |--- weights: [318.90, 0.00] class: 0 | | |--- no_of_week_nights > 3.50 | | | |--- no_of_special_requests <= 2.50 | | | | |--- lead_time <= 4.50 | | | | | |--- weights: [4.50, 0.00] class: 0 | | | | |--- lead_time > 4.50 | | | | | |--- arrival_date <= 3.50 | | | | | | |--- weights: [3.30, 0.00] class: 0 | | | | | |--- arrival_date > 3.50 | | | | | | |--- room_type_reserved_Room_Type_4 <= 0.50 | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | |--- avg_price_per_room <= 70.49 | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 70.49 | | | | | | | | | |--- avg_price_per_room <= 93.40 | | | | | | | | | | |--- lead_time <= 15.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- lead_time > 15.00 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | |--- avg_price_per_room > 93.40 | | | | | | | | | | |--- avg_price_per_room <= 107.29 | | | | | | | | | | | |--- weights: [2.70, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 107.29 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | |--- arrival_date > 19.50 | | | | | | | | |--- lead_time <= 20.00 | | | | | | | | | |--- lead_time <= 8.50 | | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | | | | | |--- lead_time > 8.50 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | |--- lead_time > 20.00 | | | | | | | | | |--- lead_time <= 54.00 | | | | | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | | | 
| | |--- weights: [4.35, 0.00] class: 0 | | | | | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | |--- lead_time > 54.00 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- weights: [1.20, 0.00] class: 0 | | | | | | |--- room_type_reserved_Room_Type_4 > 0.50 | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | | | |--- weights: [4.50, 0.00] class: 0 | | | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | |--- avg_price_per_room <= 122.10 | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 122.10 | | | | | | | | | |--- avg_price_per_room <= 153.00 | | | | | | | | | | |--- arrival_date <= 7.00 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 7.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- avg_price_per_room > 153.00 | | | | | | | | | | |--- avg_price_per_room <= 157.44 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 157.44 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | |--- no_of_special_requests > 2.50 | | | | |--- room_type_reserved_Room_Type_6 <= 0.50 | | | | | |--- weights: [9.75, 0.00] class: 0 | | | | |--- room_type_reserved_Room_Type_6 > 0.50 | | | | | |--- weights: [0.75, 0.00] class: 0 |--- lead_time > 90.50 | |--- lead_time <= 151.50 | | |--- no_of_special_requests <= 0.50 | | | |--- arrival_month <= 1.50 | | | | |--- weights: [14.55, 0.00] class: 0 | | | |--- arrival_month > 1.50 | | | | |--- market_segment_type_Online <= 0.50 | | | | | |--- arrival_month <= 11.50 | | | | | | |--- 
lead_time <= 116.50 | | | | | | | |--- lead_time <= 99.50 | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | |--- arrival_date <= 21.50 | | | | | | | | | | |--- arrival_month <= 7.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- arrival_month > 7.00 | | | | | | | | | | | |--- weights: [4.50, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 21.50 | | | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | |--- arrival_date <= 28.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- arrival_date > 28.50 | | | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected > 0.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- lead_time > 99.50 | | | | | | | | |--- avg_price_per_room <= 58.75 | | | | | | | | | |--- weights: [1.35, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 58.75 | | | | | | | | | |--- room_type_reserved_Room_Type_4 <= 0.50 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 13 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | |--- room_type_reserved_Room_Type_4 > 0.50 | | | | | | | | | | |--- lead_time <= 115.50 | | | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 115.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | |--- lead_time > 116.50 | | | | | | | |--- no_of_adults 
<= 1.50 | | | | | | | | |--- avg_price_per_room <= 122.00 | | | | | | | | | |--- room_type_reserved_Room_Type_5 <= 0.50 | | | | | | | | | | |--- weights: [12.90, 0.00] class: 0 | | | | | | | | | |--- room_type_reserved_Room_Type_5 > 0.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 122.00 | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | | |--- avg_price_per_room <= 89.88 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | | | |--- avg_price_per_room > 89.88 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | | |--- lead_time <= 140.00 | | | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 140.00 | | | | | | | | | | | |--- weights: [1.95, 0.85] class: 0 | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | |--- avg_price_per_room <= 74.12 | | | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | | | |--- weights: [1.05, 0.00] class: 0 | | | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- avg_price_per_room > 74.12 | | | | | | | | | | |--- lead_time <= 146.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- lead_time > 146.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | |--- arrival_month > 11.50 | | | | | | |--- avg_price_per_room <= 177.83 | | | | | | | |--- weights: [15.60, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 177.83 | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | |--- market_segment_type_Online > 0.50 | | | | | |--- required_car_parking_space <= 0.50 | | | | | | |--- avg_price_per_room <= 71.92 | | | | | | | |--- arrival_date <= 13.50 | | | | | | | | |--- arrival_year 
<= 2017.50 | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | |--- arrival_date <= 6.00 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 6.00 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 3.50 | | | | | | | | | | |--- weights: [2.25, 0.00] class: 0 | | | | | | | |--- arrival_date > 13.50 | | | | | | | | |--- lead_time <= 142.00 | | | | | | | | | |--- avg_price_per_room <= 70.29 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected > 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- avg_price_per_room > 70.29 | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | |--- lead_time > 142.00 | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 71.92 | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | |--- room_type_reserved_Room_Type_2 <= 0.50 | | | | | | | | | |--- avg_price_per_room <= 114.65 | | | | | | | | | | |--- no_of_children <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 18 | | | | | | | | | | |--- no_of_children > 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- avg_price_per_room > 114.65 | | | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | |--- room_type_reserved_Room_Type_2 > 0.50 | | | | | | | | | |--- arrival_date <= 20.00 | | | | | | | | | | |--- no_of_children <= 0.50 | | | 
| | | | | | | | |--- weights: [1.20, 0.00] class: 0 | | | | | | | | | | |--- no_of_children > 0.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 20.00 | | | | | | | | | | |--- avg_price_per_room <= 87.89 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 87.89 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | |--- arrival_month > 5.50 | | | | | | | | |--- lead_time <= 91.50 | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- weights: [0.00, 5.95] class: 1 | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | |--- arrival_date <= 22.50 | | | | | | | | | | | |--- weights: [1.35, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 22.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- lead_time > 91.50 | | | | | | | | | |--- avg_price_per_room <= 99.38 | | | | | | | | | | |--- avg_price_per_room <= 97.95 | | | | | | | | | | | |--- truncated branch of depth 17 | | | | | | | | | | |--- avg_price_per_room > 97.95 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- avg_price_per_room > 99.38 | | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | | |--- truncated branch of depth 12 | | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | | |--- truncated branch of depth 15 | | | | | |--- required_car_parking_space > 0.50 | | | | | | |--- no_of_week_nights <= 8.00 | | | | | | | |--- weights: [2.85, 0.00] class: 0 | | | | | | |--- no_of_week_nights > 8.00 | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | |--- no_of_special_requests > 0.50 | | | |--- no_of_special_requests <= 2.50 | | | | |--- arrival_month <= 8.50 | | | | | |--- arrival_year <= 2017.50 | | | | | | |--- 
type_of_meal_plan_Meal_Plan_2 <= 0.50 | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | |--- avg_price_per_room <= 63.45 | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 63.45 | | | | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | | | | |--- arrival_date <= 1.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 1.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- arrival_month > 7.50 | | | | | | | | |--- lead_time <= 110.50 | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | |--- lead_time <= 99.50 | | | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | | | | |--- lead_time > 99.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- lead_time > 110.50 | | | | | | | | | |--- lead_time <= 149.00 | | | | | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- lead_time > 149.00 | | | | | | | | | | |--- arrival_date <= 11.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 11.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | |--- type_of_meal_plan_Meal_Plan_2 > 0.50 | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | |--- arrival_year > 2017.50 | | | | | | |--- lead_time <= 142.50 | | | | | | | |--- avg_price_per_room <= 200.86 | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | |--- avg_price_per_room <= 71.96 | | | | | | | | | | |--- weights: [5.85, 0.00] class: 0 | | | | | | | | | |--- 
avg_price_per_room > 71.96 | | | | | | | | | | |--- avg_price_per_room <= 72.37 | | | | | | | | | | | |--- weights: [0.00, 3.40] class: 1 | | | | | | | | | | |--- avg_price_per_room > 72.37 | | | | | | | | | | | |--- truncated branch of depth 18 | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | | |--- avg_price_per_room <= 92.08 | | | | | | | | | | | |--- weights: [7.65, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 92.08 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | | |--- avg_price_per_room <= 130.95 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | | |--- avg_price_per_room > 130.95 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | |--- avg_price_per_room > 200.86 | | | | | | | | |--- weights: [0.00, 5.95] class: 1 | | | | | | |--- lead_time > 142.50 | | | | | | | |--- avg_price_per_room <= 100.25 | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | |--- weights: [4.05, 0.00] class: 0 | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | |--- room_type_reserved_Room_Type_2 <= 0.50 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- room_type_reserved_Room_Type_2 > 0.50 | | | | | | | | | | |--- avg_price_per_room <= 43.31 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 43.31 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- avg_price_per_room > 100.25 | | | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | | | |--- lead_time <= 143.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 143.50 | | 
| | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- arrival_date > 7.50 | | | | | | | | | | |--- lead_time <= 150.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- lead_time > 150.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | | | |--- weights: [1.50, 0.00] class: 0 | | | | |--- arrival_month > 8.50 | | | | | |--- avg_price_per_room <= 71.12 | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | |--- arrival_date <= 8.50 | | | | | | | | |--- arrival_date <= 6.00 | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- arrival_date > 6.00 | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | |--- arrival_date > 8.50 | | | | | | | | |--- room_type_reserved_Room_Type_2 <= 0.50 | | | | | | | | | |--- weights: [3.90, 0.00] class: 0 | | | | | | | | |--- room_type_reserved_Room_Type_2 > 0.50 | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | |--- avg_price_per_room > 71.12 | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | |--- room_type_reserved_Room_Type_2 <= 0.50 | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- weights: [2.25, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected > 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | 
| | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 23 | | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | |--- room_type_reserved_Room_Type_2 > 0.50 | | | | | | | | |--- arrival_date <= 6.50 | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- arrival_date > 6.50 | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | |--- weights: [0.60, 0.00] class: 0 | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | |--- no_of_special_requests > 2.50 | | | | |--- weights: [13.50, 0.00] class: 0 | |--- lead_time > 151.50 | | |--- avg_price_per_room <= 100.04 | | | |--- no_of_special_requests <= 0.50 | | | | |--- no_of_adults <= 1.50 | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | |--- lead_time <= 163.50 | | | | | | | |--- arrival_month <= 5.00 | | | | | | | | |--- weights: [0.60, 0.00] class: 0 | | | | | | | |--- arrival_month > 5.00 | | | | | | | | |--- avg_price_per_room <= 80.00 | | | | | | | | | |--- weights: [0.15, 0.85] class: 1 | | | | | | | | |--- avg_price_per_room > 80.00 | | | | | | | | | |--- weights: [0.00, 12.75] class: 1 | | | | | | |--- lead_time > 163.50 | | | | | | | |--- lead_time <= 341.00 | | | | | | | | |--- lead_time <= 173.00 | | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | 
| | |--- arrival_month <= 5.00 | | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 5.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- lead_time > 173.00 | | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | | |--- avg_price_per_room <= 88.00 | | | | | | | | | | | |--- weights: [1.35, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 88.00 | | | | | | | | | | | |--- weights: [0.00, 2.55] class: 1 | | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | | |--- avg_price_per_room <= 98.00 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- avg_price_per_room > 98.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- lead_time > 341.00 | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | |--- avg_price_per_room <= 88.00 | | | | | | | | | | |--- weights: [0.00, 8.50] class: 1 | | | | | | | | | |--- avg_price_per_room > 88.00 | | | | | | | | | | |--- arrival_date <= 8.50 | | | | | | | | | | | |--- weights: [0.15, 0.85] class: 1 | | | | | | | | | | |--- arrival_date > 8.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | |--- avg_price_per_room <= 80.00 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 80.00 | | | | | | | | | | |--- weights: [0.45, 1.70] class: 1 | | | | | |--- market_segment_type_Online > 0.50 | | | | | | |--- avg_price_per_room <= 2.50 | | | | | | | |--- lead_time <= 285.50 | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | |--- lead_time > 285.50 | | | | | | | | |--- type_of_meal_plan_Meal_Plan_2 <= 0.50 | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | | |--- type_of_meal_plan_Meal_Plan_2 > 0.50 | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 2.50 | | | | | | | |--- arrival_month <= 11.50 | 
| | | | | | | |--- weights: [0.00, 49.30] class: 1 | | | | | | | |--- arrival_month > 11.50 | | | | | | | | |--- avg_price_per_room <= 84.30 | | | | | | | | | |--- weights: [0.00, 5.10] class: 1 | | | | | | | | |--- avg_price_per_room > 84.30 | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | |--- no_of_adults > 1.50 | | | | | |--- arrival_year <= 2017.50 | | | | | | |--- lead_time <= 215.50 | | | | | | | |--- lead_time <= 164.00 | | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | | |--- lead_time <= 161.00 | | | | | | | | | | |--- avg_price_per_room <= 63.26 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 63.26 | | | | | | | | | | | |--- weights: [1.95, 0.00] class: 0 | | | | | | | | | |--- lead_time > 161.00 | | | | | | | | | | |--- avg_price_per_room <= 52.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 52.50 | | | | | | | | | | | |--- weights: [1.05, 0.85] class: 0 | | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | |--- lead_time > 164.00 | | | | | | | | |--- arrival_date <= 9.50 | | | | | | | | | |--- avg_price_per_room <= 74.62 | | | | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | |--- avg_price_per_room > 74.62 | | | | | | | | | | |--- weights: [0.00, 11.90] class: 1 | | | | | | | | |--- arrival_date > 9.50 | | | | | | | | | |--- weights: [0.00, 44.20] class: 1 | | | | | | |--- lead_time > 215.50 | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | |--- avg_price_per_room <= 66.00 | | | | | | | | | |--- weights: [4.35, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 66.00 | | | | | | | | | |--- weights: [6.60, 0.00] class: 0 | 
| | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | |--- weights: [0.00, 5.10] class: 1 | | | | | |--- arrival_year > 2017.50 | | | | | | |--- avg_price_per_room <= 82.47 | | | | | | | |--- lead_time <= 203.50 | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | |--- weights: [0.00, 34.85] class: 1 | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | |--- repeated_guest <= 0.50 | | | | | | | | | | |--- lead_time <= 170.50 | | | | | | | | | | | |--- weights: [5.70, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 170.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- repeated_guest > 0.50 | | | | | | | | | | |--- weights: [0.00, 2.55] class: 1 | | | | | | | |--- lead_time > 203.50 | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | | |--- no_of_week_nights <= 2.00 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- no_of_week_nights > 2.00 | | | | | | | | | | | |--- weights: [0.00, 7.65] class: 1 | | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | | |--- weights: [3.15, 0.00] class: 0 | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- weights: [0.00, 71.40] class: 1 | | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | |--- avg_price_per_room > 82.47 | | | | | | | |--- no_of_adults <= 2.50 | | | | | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | 
| | | |--- room_type_reserved_Room_Type_4 <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 > 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | | | | |--- weights: [1.05, 0.00] class: 0 | | | | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- market_segment_type_Corporate > 0.50 | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- no_of_adults > 2.50 | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | |--- no_of_special_requests > 0.50 | | | | |--- no_of_weekend_nights <= 0.50 | | | | | |--- lead_time <= 180.50 | | | | | | |--- lead_time <= 159.50 | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | |--- weights: [1.20, 0.00] class: 0 | | | | | | | |--- arrival_month > 8.50 | | | | | | | | |--- avg_price_per_room <= 74.05 | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 74.05 | | | | | | | | | |--- lead_time <= 158.50 | | | | | | | | | | |--- weights: [0.00, 3.40] class: 1 | | | | | | | | | |--- lead_time > 158.50 | | | | | | | | | | |--- avg_price_per_room <= 89.62 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 89.62 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | |--- lead_time > 159.50 | | | | | | | |--- arrival_date <= 1.50 | | | | | | | | |--- lead_time <= 176.50 | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- lead_time > 176.50 | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | |--- arrival_date > 1.50 | | | | | | | | |--- no_of_adults <= 0.50 | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- no_of_adults > 0.50 | | | | | | | | | |--- weights: [7.20, 0.00] class: 0 
| | | | | |--- lead_time > 180.50 | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | |--- no_of_week_nights <= 0.50 | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- no_of_week_nights > 0.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- weights: [0.00, 106.25] class: 1 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | |--- avg_price_per_room <= 86.22 | | | | | | | | | |--- arrival_month <= 4.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 4.50 | | | | | | | | | | |--- weights: [1.50, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 86.22 | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 > 0.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | |--- arrival_date <= 6.00 | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- arrival_date > 6.00 | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | |--- no_of_weekend_nights > 0.50 | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | |--- arrival_date <= 24.50 | | | | | | | |--- avg_price_per_room <= 74.73 | | | | | | | | |--- lead_time <= 199.50 | | | | | | | | | |--- weights: [4.80, 0.00] class: 0 | | | | | | | | |--- lead_time > 199.50 | | | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | | | 
|--- no_of_adults <= 0.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | | |--- no_of_adults > 0.50 | | | | | | | | | | | |--- weights: [3.00, 0.00] class: 0 | | | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | | | |--- avg_price_per_room <= 62.02 | | | | | | | | | | | |--- weights: [0.60, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 62.02 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | |--- avg_price_per_room > 74.73 | | | | | | | | |--- no_of_week_nights <= 6.50 | | | | | | | | | |--- arrival_date <= 1.50 | | | | | | | | | | |--- weights: [1.95, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 1.50 | | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | | |--- truncated branch of depth 15 | | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- no_of_week_nights > 6.50 | | | | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | | | | |--- arrival_date <= 16.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- arrival_date > 16.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | |--- arrival_date > 24.50 | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | | |--- lead_time <= 165.00 | | | | | | | | | | |--- arrival_date <= 28.50 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 28.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- lead_time > 165.00 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 <= 0.50 | | | | | | | | | | | |--- truncated branch of 
depth 2 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 > 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | | |--- lead_time <= 197.50 | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | |--- lead_time > 197.50 | | | | | | | | | | |--- weights: [0.00, 2.55] class: 1 | | | | | | | |--- arrival_month > 7.50 | | | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | | | |--- avg_price_per_room <= 55.92 | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 55.92 | | | | | | | | | | |--- no_of_week_nights <= 0.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- no_of_week_nights > 0.50 | | | | | | | | | | | |--- truncated branch of depth 13 | | | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | |--- lead_time <= 348.50 | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | |--- weights: [19.50, 0.00] class: 0 | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | |--- weights: [1.35, 0.00] class: 0 | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | |--- arrival_date <= 26.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | | |--- arrival_date > 26.50 | | | | | | | | | | | |--- weights: [0.30, 0.85] class: 1 | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | |--- lead_time > 348.50 | | | | | | | |--- avg_price_per_room <= 58.50 | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 58.50 | | | | | | | | |--- lead_time <= 372.50 | | | | | | | | | |--- weights: [0.90, 1.70] class: 1 | | | | | | | | |--- lead_time > 372.50 
| | | | | | | | | |--- weights: [0.15, 0.85] class: 1 | | |--- avg_price_per_room > 100.04 | | | |--- no_of_special_requests <= 2.50 | | | | |--- arrival_month <= 11.50 | | | | | |--- weights: [0.00, 1791.80] class: 1 | | | | |--- arrival_month > 11.50 | | | | | |--- no_of_special_requests <= 0.50 | | | | | | |--- weights: [7.05, 0.00] class: 0 | | | | | |--- no_of_special_requests > 0.50 | | | | | | |--- arrival_date <= 24.50 | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | |--- arrival_date > 24.50 | | | | | | | |--- lead_time <= 153.50 | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- lead_time > 153.50 | | | | | | | | |--- lead_time <= 172.50 | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- lead_time > 172.50 | | | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | | | |--- weights: [0.00, 7.65] class: 1 | | | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 > 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | |--- no_of_special_requests > 2.50 | | | | |--- market_segment_type_Online <= 0.50 | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | |--- market_segment_type_Online > 0.50 | | | | | |--- weights: [4.35, 0.00] class: 0
print(
pd.DataFrame(
model.feature_importances_, columns=["Imp"], index=X_train.columns
).sort_values(by="Imp", ascending=False)
)
| | Imp |
|---|---|
| lead_time | 0.288635 |
| avg_price_per_room | 0.133684 |
| no_of_special_requests | 0.131801 |
| arrival_month | 0.089230 |
| arrival_date | 0.088530 |
| market_segment_type_Online | 0.085939 |
| no_of_week_nights | 0.046036 |
| no_of_weekend_nights | 0.038374 |
| no_of_adults | 0.022394 |
| arrival_year | 0.017481 |
| room_type_reserved_Room_Type_4 | 0.009396 |
| required_car_parking_space | 0.008798 |
| market_segment_type_Offline | 0.008335 |
| market_segment_type_Corporate | 0.007583 |
| type_of_meal_plan_Not_Selected | 0.007501 |
| repeated_guest | 0.005043 |
| no_of_children | 0.004618 |
| type_of_meal_plan_Meal_Plan_2 | 0.002689 |
| room_type_reserved_Room_Type_2 | 0.002212 |
| room_type_reserved_Room_Type_5 | 0.000676 |
| room_type_reserved_Room_Type_6 | 0.000340 |
| room_type_reserved_Room_Type_7 | 0.000239 |
| no_of_previous_bookings_not_canceled | 0.000219 |
| room_type_reserved_Room_Type_3 | 0.000127 |
| no_of_previous_cancellations | 0.000120 |
| market_segment_type_Complementary | 0.000000 |
| type_of_meal_plan_Meal_Plan_3 | 0.000000 |
Before pruning the tree, let's check which features are most important.
feature_names = list(X_train.columns)
importances = model.feature_importances_
indices = np.argsort(importances)
plt.figure(figsize=(8, 8))
plt.title("Feature Importances")
plt.barh(range(len(indices)), importances[indices], color="violet", align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
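The ranking can also be read as a running total, which makes it easier to see how few features carry most of the importance. The following is a minimal sketch under stated assumptions: it uses synthetic data from `make_classification` instead of the hotel bookings, and the names `X_demo`, `y_demo`, `demo_tree`, and `importance_table` are hypothetical.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for X_train / y_train (assumption: not the hotel data)
X_demo, y_demo = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=1
)
X_demo = pd.DataFrame(X_demo, columns=[f"feature_{i}" for i in range(8)])

demo_tree = DecisionTreeClassifier(random_state=1).fit(X_demo, y_demo)

# Sort the importances from largest to smallest
importance_table = (
    pd.Series(demo_tree.feature_importances_, index=X_demo.columns, name="Imp")
    .sort_values(ascending=False)
    .to_frame()
)

# Running total: how much of the overall importance the top-k features cover
importance_table["Cumulative"] = importance_table["Imp"].cumsum()
print(importance_table)
```

Applied to the table above, the same running total would show that `lead_time`, `avg_price_per_room`, and `no_of_special_requests` together account for a little over half of the total importance.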
Pre-Pruning
# Choose the type of classifier.
estimator = DecisionTreeClassifier(random_state=1, class_weight={0: 0.15, 1: 0.85})
# Grid of parameters to choose from
parameters = {
"max_depth": [5, 10, 15, None],
"criterion": ["entropy", "gini"],
"splitter": ["best", "random"],
"min_impurity_decrease": [0.00001, 0.0001, 0.01],
}
# Type of scoring used to compare parameter combinations (F1 score)
scorer = make_scorer(f1_score)
# Run the grid search
grid_obj = GridSearchCV(estimator, parameters, scoring=scorer, cv=5)
grid_obj = grid_obj.fit(X_train, y_train)
# Set the clf to the best combination of parameters
estimator = grid_obj.best_estimator_
# Fit the best algorithm to the data.
estimator.fit(X_train, y_train)
DecisionTreeClassifier(class_weight={0: 0.15, 1: 0.85}, max_depth=5,
min_impurity_decrease=1e-05, random_state=1,
splitter='random')
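A fitted grid search holds more than the single best estimator. The sketch below shows one way to inspect the chosen parameters and the cross-validated score of every combination. It is an illustration only: the data come from `make_classification`, the grid `demo_grid` is a smaller hypothetical one, and the names `demo_search` and `results` are not part of the original notebook.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic, mildly imbalanced data (assumption: stands in for the hotel data)
X_demo, y_demo = make_classification(
    n_samples=400, n_features=6, weights=[0.7, 0.3], random_state=1
)

# Hypothetical, reduced parameter grid to keep the example fast
demo_grid = {"max_depth": [3, 5, None], "min_impurity_decrease": [0.0001, 0.01]}

demo_search = GridSearchCV(
    DecisionTreeClassifier(random_state=1, class_weight={0: 0.15, 1: 0.85}),
    demo_grid,
    scoring=make_scorer(f1_score),
    cv=5,
)
demo_search.fit(X_demo, y_demo)

# One row per parameter combination, best-ranked first
results = pd.DataFrame(demo_search.cv_results_)[
    ["params", "mean_test_score", "std_test_score", "rank_test_score"]
].sort_values("rank_test_score")

print(demo_search.best_params_)
print(results.head())
```

Because `GridSearchCV` refits the best combination on the full training data by default (`refit=True`), `best_estimator_` is already fitted when it is retrieved.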
confusion_matrix_sklearn(estimator, X_train, y_train)
decision_tree_tune_perf_train = model_performance_classification_sklearn(
estimator, X_train, y_train
)
decision_tree_tune_perf_train
# print("Training performance:", decision_tree_tune_perf_train)
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.599913 | 0.960899 | 0.449743 | 0.61271 |
confusion_matrix_sklearn(estimator, X_test, y_test)
decision_tree_tune_perf_test = model_performance_classification_sklearn(
estimator, X_test, y_test
)
decision_tree_tune_perf_test
# print("Test performance:", decision_tree_tune_perf_test)
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.599467 | 0.953152 | 0.444577 | 0.60634 |
plt.figure(figsize=(15, 10))
out = tree.plot_tree(
estimator,
feature_names=feature_names,
filled=True,
fontsize=9,
node_ids=False,
class_names=None,
)
for o in out:
arrow = o.arrow_patch
if arrow is not None:
arrow.set_edgecolor("black")
arrow.set_linewidth(1)
plt.show()
print(tree.export_text(estimator, feature_names=feature_names, show_weights=True))
|--- lead_time <= 66.67
|   |--- market_segment_type_Online <= 0.01
|   |   |--- repeated_guest <= 0.07
|   |   |   |--- arrival_month <= 9.95
|   |   |   |   |--- market_segment_type_Offline <= 0.81
|   |   |   |   |   |--- weights: [93.00, 92.65] class: 0
|   |   |   |   |--- market_segment_type_Offline > 0.81
|   |   |   |   |   |--- weights: [226.50, 86.70] class: 0
|   |   |   |--- arrival_month > 9.95
|   |   |   |   |--- market_segment_type_Offline <= 0.21
|   |   |   |   |   |--- weights: [52.20, 19.55] class: 0
|   |   |   |   |--- market_segment_type_Offline > 0.21
|   |   |   |   |   |--- weights: [162.45, 8.50] class: 0
|   |   |--- repeated_guest > 0.07
|   |   |   |--- lead_time <= 5.61
|   |   |   |   |--- weights: [54.75, 0.00] class: 0
|   |   |   |--- lead_time > 5.61
|   |   |   |   |--- lead_time <= 10.30
|   |   |   |   |   |--- weights: [12.60, 0.00] class: 0
|   |   |   |   |--- lead_time > 10.30
|   |   |   |   |   |--- weights: [16.65, 1.70] class: 0
|   |--- market_segment_type_Online > 0.01
|   |   |--- no_of_special_requests <= 1.88
|   |   |   |--- no_of_special_requests <= 0.92
|   |   |   |   |--- arrival_year <= 2017.94
|   |   |   |   |   |--- weights: [67.95, 36.55] class: 0
|   |   |   |   |--- arrival_year > 2017.94
|   |   |   |   |   |--- weights: [222.30, 1316.65] class: 1
|   |   |   |--- no_of_special_requests > 0.92
|   |   |   |   |--- required_car_parking_space <= 0.75
|   |   |   |   |   |--- weights: [462.60, 508.30] class: 1
|   |   |   |   |--- required_car_parking_space > 0.75
|   |   |   |   |   |--- weights: [27.45, 0.00] class: 0
|   |   |--- no_of_special_requests > 1.88
|   |   |   |--- no_of_week_nights <= 2.59
|   |   |   |   |--- weights: [194.85, 0.00] class: 0
|   |   |   |--- no_of_week_nights > 2.59
|   |   |   |   |--- no_of_week_nights <= 4.26
|   |   |   |   |   |--- weights: [86.55, 14.45] class: 0
|   |   |   |   |--- no_of_week_nights > 4.26
|   |   |   |   |   |--- weights: [12.75, 10.20] class: 0
|--- lead_time > 66.67
|   |--- lead_time <= 130.98
|   |   |--- no_of_special_requests <= 2.05
|   |   |   |--- no_of_special_requests <= 1.27
|   |   |   |   |--- no_of_special_requests <= 0.31
|   |   |   |   |   |--- weights: [241.05, 1231.65] class: 1
|   |   |   |   |--- no_of_special_requests > 0.31
|   |   |   |   |   |--- weights: [196.65, 317.05] class: 1
|   |   |   |--- no_of_special_requests > 1.27
|   |   |   |   |--- lead_time <= 89.94
|   |   |   |   |   |--- weights: [39.00, 7.65] class: 0
|   |   |   |   |--- lead_time > 89.94
|   |   |   |   |   |--- weights: [43.80, 62.05] class: 1
|   |   |--- no_of_special_requests > 2.05
|   |   |   |--- weights: [20.85, 0.00] class: 0
|   |--- lead_time > 130.98
|   |   |--- no_of_special_requests <= 2.87
|   |   |   |--- arrival_year <= 2017.95
|   |   |   |   |--- market_segment_type_Offline <= 0.33
|   |   |   |   |   |--- weights: [7.05, 77.35] class: 1
|   |   |   |   |--- market_segment_type_Offline > 0.33
|   |   |   |   |   |--- weights: [56.10, 113.05] class: 1
|   |   |   |--- arrival_year > 2017.95
|   |   |   |   |--- no_of_special_requests <= 0.01
|   |   |   |   |   |--- weights: [113.55, 2112.25] class: 1
|   |   |   |   |--- no_of_special_requests > 0.01
|   |   |   |   |   |--- weights: [131.70, 1092.25] class: 1
|   |   |--- no_of_special_requests > 2.87
|   |   |   |--- weights: [12.00, 0.00] class: 0
# Importance of features in building the tree. The importance of a feature is computed as the
# (normalized) total reduction of the criterion brought by that feature. It is also known as the Gini importance.
print(
pd.DataFrame(
estimator.feature_importances_, columns=["Imp"], index=X_train.columns
).sort_values(by="Imp", ascending=False)
)
# After pre-pruning, importance is concentrated in fewer features: only nine have a nonzero value,
# and lead_time and no_of_special_requests now dominate
| | Imp |
|---|---|
| lead_time | 0.386485 |
| no_of_special_requests | 0.356199 |
| market_segment_type_Online | 0.147447 |
| arrival_year | 0.050201 |
| market_segment_type_Offline | 0.018112 |
| arrival_month | 0.015002 |
| required_car_parking_space | 0.011297 |
| repeated_guest | 0.008009 |
| no_of_week_nights | 0.007248 |
| room_type_reserved_Room_Type_2 | 0.000000 |
| market_segment_type_Corporate | 0.000000 |
| market_segment_type_Complementary | 0.000000 |
| room_type_reserved_Room_Type_7 | 0.000000 |
| room_type_reserved_Room_Type_6 | 0.000000 |
| room_type_reserved_Room_Type_5 | 0.000000 |
| room_type_reserved_Room_Type_4 | 0.000000 |
| room_type_reserved_Room_Type_3 | 0.000000 |
| type_of_meal_plan_Meal_Plan_3 | 0.000000 |
| type_of_meal_plan_Not_Selected | 0.000000 |
| type_of_meal_plan_Meal_Plan_2 | 0.000000 |
| no_of_children | 0.000000 |
| avg_price_per_room | 0.000000 |
| no_of_previous_bookings_not_canceled | 0.000000 |
| no_of_previous_cancellations | 0.000000 |
| arrival_date | 0.000000 |
| no_of_weekend_nights | 0.000000 |
| no_of_adults | 0.000000 |
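To make the definition of Gini importance concrete, the sketch below recomputes it by hand from a fitted tree: for every internal node it adds the weighted impurity decrease of the split to the feature used at that node, then normalizes the totals to sum to 1. It uses synthetic stand-in data (`X_demo`, `y_demo`, and `demo_tree` are hypothetical names) and relies on the documented arrays of the fitted `tree_` attribute.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption)
X_demo, y_demo = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=1
)
demo_tree = DecisionTreeClassifier(max_depth=4, random_state=1).fit(X_demo, y_demo)
t = demo_tree.tree_

manual = np.zeros(X_demo.shape[1])
for node in range(t.node_count):
    left, right = t.children_left[node], t.children_right[node]
    if left == -1:  # leaves have no children and contribute nothing
        continue
    # Weighted impurity of the parent minus weighted impurity of its two children
    decrease = (
        t.weighted_n_node_samples[node] * t.impurity[node]
        - t.weighted_n_node_samples[left] * t.impurity[left]
        - t.weighted_n_node_samples[right] * t.impurity[right]
    )
    manual[t.feature[node]] += decrease

# Normalize so the importances sum to 1
manual = manual / manual.sum()

print("Manual:      ", np.round(manual, 4))
print("scikit-learn:", np.round(demo_tree.feature_importances_, 4))
```

A feature that never appears in a split receives no contribution, which is why a shallow, pre-pruned tree reports an importance of exactly zero for most columns.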
importances = estimator.feature_importances_
indices = np.argsort(importances)
plt.figure(figsize=(12, 12))
plt.title("Feature Importances")
plt.barh(range(len(indices)), importances[indices], color="violet", align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
clf = DecisionTreeClassifier(random_state=1, class_weight={0: 0.15, 1: 0.85})
path = clf.cost_complexity_pruning_path(X_train, y_train)
ccp_alphas, impurities = abs(path.ccp_alphas), path.impurities
pd.DataFrame(path)
| | ccp_alphas | impurities |
|---|---|---|
| 0 | 0.000000e+00 | 0.006774 |
| 1 | 1.034059e-20 | 0.006774 |
| 2 | 1.034059e-20 | 0.006774 |
| 3 | 1.034059e-20 | 0.006774 |
| 4 | 1.034059e-20 | 0.006774 |
| ... | ... | ... |
| 1828 | 5.456394e-03 | 0.272057 |
| 1829 | 6.138880e-03 | 0.278196 |
| 1830 | 1.459269e-02 | 0.292789 |
| 1831 | 2.518565e-02 | 0.343160 |
| 1832 | 4.577428e-02 | 0.388934 |
1833 rows × 2 columns
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(ccp_alphas[:-1], impurities[:-1], marker="o", drawstyle="steps-post")
ax.set_xlabel("effective alpha")
ax.set_ylabel("total impurity of leaves")
ax.set_title("Total Impurity vs effective alpha for training set")
plt.show()
Next, we train one decision tree for each effective alpha. The last value in ccp_alphas is the alpha that prunes the whole tree, leaving the last tree, clfs[-1], with a single node.
clfs = []
for ccp_alpha in ccp_alphas:
clf = DecisionTreeClassifier(
random_state=1, ccp_alpha=ccp_alpha, class_weight={0: 0.15, 1: 0.85}
)
clf.fit(X_train, y_train)
clfs.append(clf)
print(
"Number of nodes in the last tree is: {} with ccp_alpha: {}".format(
clfs[-1].tree_.node_count, ccp_alphas[-1]
)
)
Number of nodes in the last tree is: 1 with ccp_alpha: 0.04577428273915718
clfs = clfs[:-1]
ccp_alphas = ccp_alphas[:-1]
node_counts = [clf.tree_.node_count for clf in clfs]
depth = [clf.tree_.max_depth for clf in clfs]
fig, ax = plt.subplots(2, 1, figsize=(10, 7))
ax[0].plot(ccp_alphas, node_counts, marker="o", drawstyle="steps-post")
ax[0].set_xlabel("alpha")
ax[0].set_ylabel("number of nodes")
ax[0].set_title("Number of nodes vs alpha")
ax[1].plot(ccp_alphas, depth, marker="o", drawstyle="steps-post")
ax[1].set_xlabel("alpha")
ax[1].set_ylabel("depth of tree")
ax[1].set_title("Depth vs alpha")
fig.tight_layout()
f1_train = []
for clf in clfs:
pred_train = clf.predict(X_train)
values_train = f1_score(y_train, pred_train)
f1_train.append(values_train)
f1_test = []
for clf in clfs:
pred_test = clf.predict(X_test)
values_test = f1_score(y_test, pred_test)
f1_test.append(values_test)
fig, ax = plt.subplots(figsize=(15, 5))
ax.set_xlabel("alpha")
ax.set_ylabel("F1 Score")
ax.set_title("F1 Score vs alpha for training and testing sets")
ax.plot(ccp_alphas, f1_train, marker="o", label="train", drawstyle="steps-post")
ax.plot(ccp_alphas, f1_test, marker="o", label="test", drawstyle="steps-post")
ax.legend()
plt.show()
index_best_model = np.argmax(f1_test)
best_model_1 = clfs[index_best_model]
print(best_model_1)
DecisionTreeClassifier(ccp_alpha=2.4753365544464932e-05,
class_weight={0: 0.15, 1: 0.85}, random_state=1)
best_model_1.fit(X_train, y_train)
DecisionTreeClassifier(ccp_alpha=2.4753365544464932e-05,
class_weight={0: 0.15, 1: 0.85}, random_state=1)
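Picking the alpha that maximizes F1 on the test set means the test set has influenced model selection, so the test score reported afterwards can be optimistic. One common alternative, shown here as a sketch and not as the approach used in this notebook, is to choose `ccp_alpha` by cross-validation on the training data only. The data is synthetic, and the names `X_tr`, `X_te`, `candidate_alphas`, `cv_f1`, and `final_tree` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption)
X_demo, y_demo = make_classification(
    n_samples=600, n_features=8, n_informative=4, random_state=1
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=1, stratify=y_demo
)

# Candidate alphas come from the pruning path of the training data
path = DecisionTreeClassifier(random_state=1).cost_complexity_pruning_path(X_tr, y_tr)

# Drop the last alpha (it prunes the tree to a single node), clip tiny negative
# values caused by floating-point error, and remove duplicates
candidate_alphas = np.unique(np.clip(path.ccp_alphas[:-1], 0, None))

# Mean 5-fold cross-validated F1 on the training set for each candidate
cv_f1 = [
    cross_val_score(
        DecisionTreeClassifier(random_state=1, ccp_alpha=alpha),
        X_tr,
        y_tr,
        scoring="f1",
        cv=5,
    ).mean()
    for alpha in candidate_alphas
]

best_alpha = candidate_alphas[int(np.argmax(cv_f1))]
final_tree = DecisionTreeClassifier(random_state=1, ccp_alpha=best_alpha).fit(X_tr, y_tr)

# The test set is touched only once, for the final estimate
test_f1 = f1_score(y_te, final_tree.predict(X_te))
print("Chosen ccp_alpha:", best_alpha)
print("Mean CV F1 at chosen alpha:", round(max(cv_f1), 4))
print("Test F1:", round(test_f1, 4))
```

With a pruning path as long as the one above (1833 alphas), cross-validating every candidate is costly, so subsampling the candidate list is a reasonable compromise.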
confusion_matrix_sklearn(best_model_1, X_train, y_train)
decision_tree_post_perf_train = model_performance_classification_sklearn(
best_model_1, X_train, y_train
)
decision_tree_post_perf_train
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.973614 | 0.997728 | 0.927626 | 0.961401 |
confusion_matrix_sklearn(best_model_1, X_test, y_test)
decision_tree_post_perf_test = model_performance_classification_sklearn(
best_model_1, X_test, y_test
)
decision_tree_post_perf_test
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.867867 | 0.803521 | 0.791387 | 0.797408 |
plt.figure(figsize=(20, 10))
out = tree.plot_tree(
best_model_1,
feature_names=feature_names,
filled=True,
fontsize=9,
node_ids=False,
class_names=None,
)
for o in out:
arrow = o.arrow_patch
if arrow is not None:
arrow.set_edgecolor("black")
arrow.set_linewidth(1)
plt.show()
print(tree.export_text(best_model_1, feature_names=feature_names, show_weights=True))
|--- lead_time <= 90.50 | |--- no_of_special_requests <= 1.50 | | |--- market_segment_type_Online <= 0.50 | | | |--- no_of_weekend_nights <= 0.50 | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | |--- avg_price_per_room <= 178.44 | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | |--- avg_price_per_room <= 92.50 | | | | | | | | |--- weights: [25.65, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 92.50 | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | | |--- lead_time <= 3.50 | | | | | | | | | | | |--- weights: [1.35, 3.40] class: 1 | | | | | | | | | | |--- lead_time > 3.50 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | | |--- weights: [1.05, 0.00] class: 0 | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | |--- no_of_previous_bookings_not_canceled <= 0.50 | | | | | | | | | | | |--- weights: [0.00, 4.25] class: 1 | | | | | | | | | | |--- no_of_previous_bookings_not_canceled > 0.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | |--- weights: [267.90, 0.00] class: 0 | | | | | |--- avg_price_per_room > 178.44 | | | | | | |--- arrival_date <= 25.50 | | | | | | | |--- lead_time <= 18.00 | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- lead_time > 18.00 | | | | | | | | |--- weights: [0.00, 13.60] class: 1 | | | | | | |--- arrival_date > 25.50 | | | | | | | |--- weights: [0.60, 0.00] class: 0 | | | | |--- market_segment_type_Corporate > 0.50 | | | | | |--- repeated_guest <= 0.50 | | | | | | |--- no_of_adults <= 1.50 | | | | | | | |--- lead_time <= 19.50 | | | | | | | | |--- lead_time <= 5.50 | | | | | | | | | |--- lead_time <= 3.50 | | | | | | | | | | 
|--- avg_price_per_room <= 161.09 | | | | | | | | | | | |--- truncated branch of depth 13 | | | | | | | | | | |--- avg_price_per_room > 161.09 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- lead_time > 3.50 | | | | | | | | | | |--- avg_price_per_room <= 82.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- avg_price_per_room > 82.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- lead_time > 5.50 | | | | | | | | | |--- avg_price_per_room <= 70.00 | | | | | | | | | | |--- avg_price_per_room <= 66.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- avg_price_per_room > 66.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- avg_price_per_room > 70.00 | | | | | | | | | | |--- arrival_date <= 29.00 | | | | | | | | | | | |--- weights: [16.95, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 29.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- lead_time > 19.50 | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | |--- lead_time <= 48.50 | | | | | | | | | | |--- no_of_special_requests <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- no_of_special_requests > 0.50 | | | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | | | | | |--- lead_time > 48.50 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- weights: [4.65, 0.85] class: 0 | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | |--- weights: [2.70, 0.00] class: 0 | | | | | | |--- no_of_adults > 1.50 | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | |--- weights: [5.85, 0.00] class: 0 | | | | | | | |--- arrival_date > 7.50 | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | |--- arrival_date <= 13.50 | | | | | | | | | | |--- 
arrival_year <= 2017.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | | | |--- weights: [1.95, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 13.50 | | | | | | | | | | |--- arrival_date <= 18.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- arrival_date > 18.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | |--- avg_price_per_room <= 182.50 | | | | | | | | | | |--- arrival_date <= 24.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 24.50 | | | | | | | | | | | |--- weights: [3.75, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 182.50 | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | |--- repeated_guest > 0.50 | | | | | | |--- weights: [38.10, 0.00] class: 0 | | | |--- no_of_weekend_nights > 0.50 | | | | |--- lead_time <= 68.50 | | | | | |--- arrival_month <= 9.50 | | | | | | |--- no_of_special_requests <= 0.50 | | | | | | | |--- avg_price_per_room <= 63.29 | | | | | | | | |--- arrival_date <= 20.50 | | | | | | | | | |--- type_of_meal_plan_Not_Selected <= 0.50 | | | | | | | | | | |--- weights: [8.40, 0.00] class: 0 | | | | | | | | | |--- type_of_meal_plan_Not_Selected > 0.50 | | | | | | | | | | |--- weights: [0.15, 1.70] class: 1 | | | | | | | | |--- arrival_date > 20.50 | | | | | | | | | |--- no_of_week_nights <= 0.50 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- no_of_week_nights > 0.50 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | |--- avg_price_per_room > 63.29 | | | | | | | | |--- no_of_weekend_nights <= 3.50 | | | | | | | | | |--- arrival_month <= 2.50 | | | | | | | | | | |--- avg_price_per_room <= 
73.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- avg_price_per_room > 73.00 | | | | | | | | | | | |--- weights: [9.60, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 2.50 | | | | | | | | | | |--- lead_time <= 59.50 | | | | | | | | | | | |--- truncated branch of depth 14 | | | | | | | | | | |--- lead_time > 59.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | |--- no_of_weekend_nights > 3.50 | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | |--- weights: [0.00, 8.50] class: 1 | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | |--- no_of_special_requests > 0.50 | | | | | | | |--- avg_price_per_room <= 127.00 | | | | | | | | |--- weights: [35.10, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 127.00 | | | | | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | | | | | |--- weights: [2.10, 0.00] class: 0 | | | | | | | | |--- market_segment_type_Corporate > 0.50 | | | | | | | | | |--- weights: [0.00, 2.55] class: 1 | | | | | |--- arrival_month > 9.50 | | | | | | |--- no_of_week_nights <= 11.50 | | | | | | | |--- lead_time <= 65.50 | | | | | | | | |--- avg_price_per_room <= 66.75 | | | | | | | | | |--- lead_time <= 3.50 | | | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | | | |--- weights: [3.90, 0.00] class: 0 | | | | | | | | | |--- lead_time > 3.50 | | | | | | | | | | |--- weights: [27.90, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 66.75 | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | |--- arrival_date <= 6.50 | | | | | | | | | | | |--- weights: [4.35, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 6.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | |--- lead_time <= 0.50 | | | | | | | | | | | 
|--- truncated branch of depth 2 | | | | | | | | | | |--- lead_time > 0.50 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | |--- lead_time > 65.50 | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | |--- weights: [0.90, 2.55] class: 1 | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | |--- weights: [2.10, 0.00] class: 0 | | | | | | |--- no_of_week_nights > 11.50 | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | |--- lead_time > 68.50 | | | | | |--- avg_price_per_room <= 99.98 | | | | | | |--- arrival_month <= 3.50 | | | | | | | |--- avg_price_per_room <= 62.50 | | | | | | | | |--- weights: [3.60, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 62.50 | | | | | | | | |--- arrival_date <= 25.50 | | | | | | | | | |--- arrival_date <= 14.50 | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 14.50 | | | | | | | | | | |--- avg_price_per_room <= 87.00 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- avg_price_per_room > 87.00 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- arrival_date > 25.50 | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | | |--- arrival_month > 3.50 | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | |--- arrival_date <= 25.50 | | | | | | | | | |--- lead_time <= 88.50 | | | | | | | | | | |--- weights: [12.60, 0.00] class: 0 | | | | | | | | | |--- lead_time > 88.50 | | | | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | | | | |--- weights: [1.05, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 8.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- arrival_date > 25.50 | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | |--- avg_price_per_room <= 80.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | 
| | | | | | | |--- avg_price_per_room > 80.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | |--- no_of_special_requests <= 0.50 | | | | | | | | | |--- lead_time <= 73.50 | | | | | | | | | | |--- weights: [0.00, 2.55] class: 1 | | | | | | | | | |--- lead_time > 73.50 | | | | | | | | | | |--- lead_time <= 81.50 | | | | | | | | | | | |--- weights: [2.70, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 81.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | |--- no_of_special_requests > 0.50 | | | | | | | | | |--- weights: [4.20, 0.00] class: 0 | | | | | |--- avg_price_per_room > 99.98 | | | | | | |--- arrival_year <= 2017.50 | | | | | | | |--- weights: [1.80, 0.00] class: 0 | | | | | | |--- arrival_year > 2017.50 | | | | | | | |--- no_of_special_requests <= 0.50 | | | | | | | | |--- avg_price_per_room <= 132.43 | | | | | | | | | |--- arrival_month <= 2.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 2.50 | | | | | | | | | | |--- no_of_children <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | | | |--- no_of_children > 0.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 132.43 | | | | | | | | | |--- weights: [1.35, 0.00] class: 0 | | | | | | | |--- no_of_special_requests > 0.50 | | | | | | | | |--- weights: [1.50, 0.00] class: 0 | | |--- market_segment_type_Online > 0.50 | | | |--- no_of_special_requests <= 0.50 | | | | |--- lead_time <= 3.50 | | | | | |--- avg_price_per_room <= 202.67 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | |--- weights: [10.05, 0.00] class: 0 | | | | | | | |--- arrival_month > 1.50 | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | |--- avg_price_per_room <= 74.57 | | | | | | | | | | |--- weights: [5.55, 0.00] class: 0 | | | | | | | | | |--- 
avg_price_per_room > 74.57 | | | | | | | | | | |--- avg_price_per_room <= 164.33 | | | | | | | | | | | |--- truncated branch of depth 15 | | | | | | | | | | |--- avg_price_per_room > 164.33 | | | | | | | | | | | |--- weights: [3.30, 0.00] class: 0 | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | |--- lead_time <= 0.50 | | | | | | | | | | |--- weights: [2.25, 0.00] class: 0 | | | | | | | | | |--- lead_time > 0.50 | | | | | | | | | | |--- arrival_date <= 26.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- arrival_date > 26.50 | | | | | | | | | | | |--- weights: [0.00, 11.90] class: 1 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- avg_price_per_room <= 169.67 | | | | | | | | |--- avg_price_per_room <= 94.66 | | | | | | | | | |--- weights: [11.70, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 94.66 | | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | | |--- arrival_date <= 28.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_date > 28.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | | |--- weights: [1.80, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 169.67 | | | | | | | | |--- avg_price_per_room <= 182.25 | | | | | | | | | |--- arrival_date <= 11.00 | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 11.00 | | | | | | | | | | |--- arrival_date <= 24.50 | | | | | | | | | | | |--- weights: [0.15, 2.55] class: 1 | | | | | | | | | | |--- arrival_date > 24.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 182.25 | | | | | | | | | |--- weights: [1.05, 0.00] class: 0 | | | | | |--- avg_price_per_room > 202.67 | | | | | | |--- 
arrival_month <= 11.00 | | | | | | | |--- weights: [0.00, 12.75] class: 1 | | | | | | |--- arrival_month > 11.00 | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | |--- lead_time > 3.50 | | | | | |--- arrival_year <= 2017.50 | | | | | | |--- lead_time <= 62.50 | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | | |--- weights: [3.30, 0.00] class: 0 | | | | | | | | |--- arrival_date > 7.50 | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | |--- no_of_week_nights <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- no_of_week_nights > 0.50 | | | | | | | | | | | |--- weights: [3.30, 0.00] class: 0 | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | |--- avg_price_per_room <= 196.89 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- avg_price_per_room > 196.89 | | | | | | | | | | | |--- weights: [0.00, 3.40] class: 1 | | | | | | | |--- arrival_month > 9.50 | | | | | | | | |--- avg_price_per_room <= 129.92 | | | | | | | | | |--- lead_time <= 26.50 | | | | | | | | | | |--- weights: [19.95, 0.00] class: 0 | | | | | | | | | |--- lead_time > 26.50 | | | | | | | | | | |--- arrival_date <= 6.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- arrival_date > 6.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | |--- avg_price_per_room > 129.92 | | | | | | | | | |--- avg_price_per_room <= 145.50 | | | | | | | | | | |--- lead_time <= 9.50 | | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 9.50 | | | | | | | | | | | |--- weights: [0.15, 1.70] class: 1 | | | | | | | | | |--- avg_price_per_room > 145.50 | | | | | | | | | | |--- weights: [1.35, 0.00] class: 0 | | | | | | |--- lead_time > 62.50 | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | |--- avg_price_per_room <= 47.63 | | | | | | | | | |--- weights: [0.60, 0.00] class: 0 | | 
| | | | | | |--- avg_price_per_room > 47.63 | | | | | | | | | |--- no_of_children <= 0.50 | | | | | | | | | | |--- avg_price_per_room <= 78.62 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- avg_price_per_room > 78.62 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- no_of_children > 0.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- arrival_date > 27.50 | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | |--- arrival_year > 2017.50 | | | | | | |--- arrival_month <= 1.50 | | | | | | | |--- lead_time <= 24.50 | | | | | | | | |--- weights: [14.40, 0.00] class: 0 | | | | | | | |--- lead_time > 24.50 | | | | | | | | |--- avg_price_per_room <= 57.94 | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 57.94 | | | | | | | | | |--- arrival_date <= 9.50 | | | | | | | | | | |--- avg_price_per_room <= 142.12 | | | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 142.12 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | |--- arrival_date > 9.50 | | | | | | | | | | |--- arrival_date <= 30.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_date > 30.00 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | |--- arrival_month > 1.50 | | | | | | | |--- avg_price_per_room <= 61.67 | | | | | | | | |--- avg_price_per_room <= 47.20 | | | | | | | | | |--- weights: [3.15, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 47.20 | | | | | | | | | |--- lead_time <= 51.50 | | | | | | | | | | |--- avg_price_per_room <= 57.43 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- avg_price_per_room > 57.43 | | | | | | | | | | | |--- weights: [0.60, 0.00] class: 0 | | | | | | | | | |--- lead_time > 51.50 | | | | | | | | | | |--- weights: [2.10, 0.00] class: 0 | | | | | | | |--- 
avg_price_per_room > 61.67 | | | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- lead_time <= 9.50 | | | | | | | | | | | |--- truncated branch of depth 15 | | | | | | | | | | |--- lead_time > 9.50 | | | | | | | | | | | |--- truncated branch of depth 33 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- lead_time <= 24.50 | | | | | | | | | | | |--- weights: [16.95, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 24.50 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | | | |--- avg_price_per_room <= 204.50 | | | | | | | | | | |--- weights: [6.00, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 204.50 | | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | |--- no_of_special_requests > 0.50 | | | | |--- arrival_month <= 11.50 | | | | | |--- lead_time <= 6.50 | | | | | | |--- avg_price_per_room <= 157.64 | | | | | | | |--- no_of_week_nights <= 10.00 | | | | | | | | |--- lead_time <= 4.50 | | | | | | | | | |--- room_type_reserved_Room_Type_2 <= 0.50 | | | | | | | | | | |--- arrival_date <= 4.50 | | | | | | | | | | | |--- weights: [11.70, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 4.50 | | | | | | | | | | | |--- truncated branch of depth 13 | | | | | | | | | |--- room_type_reserved_Room_Type_2 > 0.50 | | | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- lead_time > 4.50 | | | | | | | | | |--- arrival_date <= 13.50 | | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 13.50 | | | | | | | | | | |--- 
avg_price_per_room <= 138.79 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- avg_price_per_room > 138.79 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | |--- no_of_week_nights > 10.00 | | | | | | | | |--- lead_time <= 4.50 | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | | |--- lead_time > 4.50 | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 157.64 | | | | | | | |--- arrival_date <= 18.50 | | | | | | | | |--- arrival_date <= 10.00 | | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | | |--- weights: [3.45, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | | |--- lead_time <= 4.00 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 4.00 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- arrival_date > 10.00 | | | | | | | | | |--- arrival_date <= 16.50 | | | | | | | | | | |--- avg_price_per_room <= 186.67 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- avg_price_per_room > 186.67 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- arrival_date > 16.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- arrival_date > 18.50 | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | |--- lead_time <= 1.50 | | | | | | | | | | |--- arrival_date <= 20.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 20.50 | | | | | | | | | | | |--- weights: [0.30, 1.70] class: 1 | | | | | | | | | |--- lead_time > 1.50 | | | | | | | | | | |--- weights: [1.80, 0.00] class: 0 | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | |--- weights: [5.25, 0.00] class: 0 | | | | | |--- 
lead_time > 6.50 | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | |--- weights: [15.60, 0.00] class: 0 | | | | | | | |--- arrival_month > 1.50 | | | | | | | | |--- avg_price_per_room <= 121.78 | | | | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | | | | |--- avg_price_per_room <= 67.36 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- avg_price_per_room > 67.36 | | | | | | | | | | | |--- truncated branch of depth 28 | | | | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | | | | |--- no_of_weekend_nights <= 2.50 | | | | | | | | | | | |--- truncated branch of depth 17 | | | | | | | | | | |--- no_of_weekend_nights > 2.50 | | | | | | | | | | | |--- weights: [0.30, 13.60] class: 1 | | | | | | | | |--- avg_price_per_room > 121.78 | | | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | | | |--- truncated branch of depth 21 | | | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | | | |--- truncated branch of depth 16 | | | | | | | | | |--- arrival_month > 8.50 | | | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | | | |--- truncated branch of depth 22 | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | |--- weights: [19.65, 0.00] class: 0 | | | | |--- arrival_month > 11.50 | | | | | |--- no_of_weekend_nights <= 4.50 | | | | | | |--- weights: [59.40, 0.00] class: 0 | | | | | |--- no_of_weekend_nights > 4.50 | | | | | | |--- weights: [0.00, 0.85] class: 1 | |--- no_of_special_requests > 1.50 | | |--- no_of_week_nights <= 3.50 | | | |--- weights: [318.90, 0.00] class: 0 | | |--- no_of_week_nights > 3.50 | | | |--- no_of_special_requests <= 2.50 | | | | |--- lead_time <= 4.50 | | | | | |--- weights: [4.50, 0.00] class: 0 | | | | |--- lead_time > 4.50 | | | | | |--- arrival_date <= 3.50 | | 
| | | | |--- weights: [3.30, 0.00] class: 0 | | | | | |--- arrival_date > 3.50 | | | | | | |--- room_type_reserved_Room_Type_4 <= 0.50 | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | |--- avg_price_per_room <= 70.49 | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 70.49 | | | | | | | | | |--- avg_price_per_room <= 93.40 | | | | | | | | | | |--- lead_time <= 15.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- lead_time > 15.00 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | |--- avg_price_per_room > 93.40 | | | | | | | | | | |--- avg_price_per_room <= 107.29 | | | | | | | | | | | |--- weights: [2.70, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 107.29 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | |--- arrival_date > 19.50 | | | | | | | | |--- lead_time <= 20.00 | | | | | | | | | |--- lead_time <= 8.50 | | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | | | | | |--- lead_time > 8.50 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- lead_time > 20.00 | | | | | | | | | |--- lead_time <= 54.00 | | | | | | | | | | |--- weights: [4.80, 0.00] class: 0 | | | | | | | | | |--- lead_time > 54.00 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- weights: [1.20, 0.00] class: 0 | | | | | | |--- room_type_reserved_Room_Type_4 > 0.50 | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | |--- weights: [4.95, 0.00] class: 0 | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | |--- avg_price_per_room <= 122.10 | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | 
| | |--- avg_price_per_room > 122.10 | | | | | | | | | |--- avg_price_per_room <= 153.00 | | | | | | | | | | |--- arrival_date <= 7.00 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 7.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- avg_price_per_room > 153.00 | | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | |--- no_of_special_requests > 2.50 | | | | |--- weights: [10.50, 0.00] class: 0 |--- lead_time > 90.50 | |--- lead_time <= 151.50 | | |--- no_of_special_requests <= 0.50 | | | |--- arrival_month <= 1.50 | | | | |--- weights: [14.55, 0.00] class: 0 | | | |--- arrival_month > 1.50 | | | | |--- market_segment_type_Online <= 0.50 | | | | | |--- arrival_month <= 11.50 | | | | | | |--- lead_time <= 116.50 | | | | | | | |--- lead_time <= 99.50 | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | |--- arrival_date <= 21.50 | | | | | | | | | | |--- arrival_month <= 7.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- arrival_month > 7.00 | | | | | | | | | | | |--- weights: [4.50, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 21.50 | | | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | |--- arrival_date <= 28.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_date > 28.50 | | | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected <= 0.50 | | | | | | | | | | | |--- weights: [0.15, 17.85] class: 1 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected > 0.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | 
| | | | | | |--- lead_time > 99.50 | | | | | | | | |--- avg_price_per_room <= 58.75 | | | | | | | | | |--- weights: [1.35, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 58.75 | | | | | | | | | |--- room_type_reserved_Room_Type_4 <= 0.50 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | | |--- room_type_reserved_Room_Type_4 > 0.50 | | | | | | | | | | |--- lead_time <= 115.50 | | | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 115.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | |--- lead_time > 116.50 | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | |--- avg_price_per_room <= 122.00 | | | | | | | | | |--- weights: [13.05, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 122.00 | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | | |--- avg_price_per_room <= 89.88 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- avg_price_per_room > 89.88 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | | |--- weights: [2.85, 0.85] class: 0 | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | |--- avg_price_per_room <= 74.12 | | | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | | | |--- weights: [1.05, 0.00] class: 0 | | | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- avg_price_per_room > 74.12 | | | | | | | | | | |--- lead_time <= 146.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- lead_time > 146.50 | | | | | | | | | | | |--- truncated 
branch of depth 3 | | | | | |--- arrival_month > 11.50 | | | | | | |--- avg_price_per_room <= 177.83 | | | | | | | |--- weights: [15.60, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 177.83 | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | |--- market_segment_type_Online > 0.50 | | | | | |--- required_car_parking_space <= 0.50 | | | | | | |--- avg_price_per_room <= 71.92 | | | | | | | |--- arrival_date <= 13.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- weights: [3.00, 0.00] class: 0 | | | | | | | |--- arrival_date > 13.50 | | | | | | | | |--- lead_time <= 142.00 | | | | | | | | | |--- avg_price_per_room <= 70.29 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected > 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- avg_price_per_room > 70.29 | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | |--- lead_time > 142.00 | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 71.92 | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | |--- room_type_reserved_Room_Type_2 <= 0.50 | | | | | | | | | |--- avg_price_per_room <= 114.65 | | | | | | | | | | |--- no_of_children <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 14 | | | | | | | | | | |--- no_of_children > 0.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 114.65 | | | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | | | |--- weights: [0.15, 25.50] class: 1 | | | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | | | |--- truncated 
branch of depth 3 | | | | | | | | |--- room_type_reserved_Room_Type_2 > 0.50 | | | | | | | | | |--- arrival_date <= 20.00 | | | | | | | | | | |--- weights: [1.50, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 20.00 | | | | | | | | | | |--- avg_price_per_room <= 87.89 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 87.89 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- arrival_month > 5.50 | | | | | | | | |--- lead_time <= 91.50 | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | |--- weights: [0.30, 7.65] class: 1 | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | |--- arrival_date <= 22.50 | | | | | | | | | | | |--- weights: [1.35, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 22.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- lead_time > 91.50 | | | | | | | | | |--- avg_price_per_room <= 99.38 | | | | | | | | | | |--- avg_price_per_room <= 97.95 | | | | | | | | | | | |--- truncated branch of depth 13 | | | | | | | | | | |--- avg_price_per_room > 97.95 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 99.38 | | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | | |--- truncated branch of depth 12 | | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | | |--- truncated branch of depth 15 | | | | | |--- required_car_parking_space > 0.50 | | | | | | |--- no_of_week_nights <= 8.00 | | | | | | | |--- weights: [2.85, 0.00] class: 0 | | | | | | |--- no_of_week_nights > 8.00 | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | |--- no_of_special_requests > 0.50 | | | |--- no_of_special_requests <= 2.50 | | | | |--- arrival_month <= 8.50 | | | | | |--- arrival_year <= 2017.50 | | | | | | |--- type_of_meal_plan_Meal_Plan_2 <= 0.50 | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | |--- avg_price_per_room <= 63.45 | | | | | | | 
| | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 63.45 | | | | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | | | | |--- arrival_date <= 1.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 1.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- arrival_month > 7.50 | | | | | | | | |--- lead_time <= 110.50 | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | |--- lead_time <= 99.50 | | | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | | | | |--- lead_time > 99.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- lead_time > 110.50 | | | | | | | | | |--- lead_time <= 149.00 | | | | | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | |--- lead_time > 149.00 | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | |--- type_of_meal_plan_Meal_Plan_2 > 0.50 | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | |--- arrival_year > 2017.50 | | | | | | |--- lead_time <= 142.50 | | | | | | | |--- avg_price_per_room <= 200.86 | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | |--- avg_price_per_room <= 71.96 | | | | | | | | | | |--- weights: [5.85, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 71.96 | | | | | | | | | | |--- avg_price_per_room <= 72.37 | | | | | | | | | | | |--- weights: [0.00, 3.40] class: 1 | | | | | | | | | | |--- avg_price_per_room > 72.37 | | | | | | | | | | | |--- truncated branch of depth 17 | | | | | | | | |--- arrival_month > 5.50 | | | | | 
| | | | |--- arrival_date <= 19.50 | | | | | | | | | | |--- avg_price_per_room <= 92.08 | | | | | | | | | | | |--- weights: [7.65, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 92.08 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | | |--- avg_price_per_room <= 130.95 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | | |--- avg_price_per_room > 130.95 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | |--- avg_price_per_room > 200.86 | | | | | | | | |--- weights: [0.00, 5.95] class: 1 | | | | | | |--- lead_time > 142.50 | | | | | | | |--- avg_price_per_room <= 100.25 | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | |--- weights: [4.05, 0.00] class: 0 | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | |--- room_type_reserved_Room_Type_2 <= 0.50 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- room_type_reserved_Room_Type_2 > 0.50 | | | | | | | | | | |--- avg_price_per_room <= 43.31 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 43.31 | | | | | | | | | | | |--- weights: [0.00, 2.55] class: 1 | | | | | | | |--- avg_price_per_room > 100.25 | | | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | | | |--- lead_time <= 143.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 143.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- arrival_date > 7.50 | | | | | | | | | | |--- lead_time <= 150.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- lead_time > 150.50 | | | | | | | | | | | |--- weights: [0.00, 5.10] class: 
1 | | | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | | | |--- weights: [1.50, 0.00] class: 0 | | | | |--- arrival_month > 8.50 | | | | | |--- avg_price_per_room <= 71.12 | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | |--- arrival_date <= 8.50 | | | | | | | | |--- arrival_date <= 6.00 | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- arrival_date > 6.00 | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | |--- arrival_date > 8.50 | | | | | | | | |--- weights: [4.20, 0.00] class: 0 | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | |--- avg_price_per_room > 71.12 | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | |--- room_type_reserved_Room_Type_2 <= 0.50 | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- weights: [2.25, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- type_of_meal_plan_Not_Selected > 0.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 21 | | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | |--- 
room_type_reserved_Room_Type_2 > 0.50 | | | | | | | | |--- arrival_date <= 6.50 | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- arrival_date > 6.50 | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | |--- weights: [1.05, 0.00] class: 0 | | | |--- no_of_special_requests > 2.50 | | | | |--- weights: [13.50, 0.00] class: 0 | |--- lead_time > 151.50 | | |--- avg_price_per_room <= 100.04 | | | |--- no_of_special_requests <= 0.50 | | | | |--- no_of_adults <= 1.50 | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | |--- lead_time <= 163.50 | | | | | | | |--- arrival_month <= 5.00 | | | | | | | | |--- weights: [0.60, 0.00] class: 0 | | | | | | | |--- arrival_month > 5.00 | | | | | | | | |--- weights: [0.15, 13.60] class: 1 | | | | | | |--- lead_time > 163.50 | | | | | | | |--- lead_time <= 341.00 | | | | | | | | |--- lead_time <= 173.00 | | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | | |--- weights: [9.45, 5.10] class: 0 | | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | | |--- arrival_month <= 5.00 | | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 5.00 | | | | | | | | | | | |--- weights: [0.00, 7.65] class: 1 | | | | | | | | |--- lead_time > 173.00 | | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | | |--- avg_price_per_room <= 88.00 | | | | | | | | | | | |--- weights: [1.35, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 88.00 | | | | | | | | | | | |--- weights: [0.00, 2.55] class: 1 | | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | | |--- avg_price_per_room <= 98.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- avg_price_per_room > 98.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- lead_time > 341.00 | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | |--- avg_price_per_room <= 
88.00 | | | | | | | | | | |--- weights: [0.00, 8.50] class: 1 | | | | | | | | | |--- avg_price_per_room > 88.00 | | | | | | | | | | |--- weights: [1.50, 5.10] class: 1 | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | |--- avg_price_per_room <= 80.00 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 80.00 | | | | | | | | | | |--- weights: [0.45, 1.70] class: 1 | | | | | |--- market_segment_type_Online > 0.50 | | | | | | |--- avg_price_per_room <= 2.50 | | | | | | | |--- lead_time <= 285.50 | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | |--- lead_time > 285.50 | | | | | | | | |--- type_of_meal_plan_Meal_Plan_2 <= 0.50 | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | | |--- type_of_meal_plan_Meal_Plan_2 > 0.50 | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 2.50 | | | | | | | |--- weights: [0.15, 54.40] class: 1 | | | | |--- no_of_adults > 1.50 | | | | | |--- arrival_year <= 2017.50 | | | | | | |--- lead_time <= 215.50 | | | | | | | |--- lead_time <= 164.00 | | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | | |--- lead_time <= 161.00 | | | | | | | | | | |--- weights: [2.10, 0.00] class: 0 | | | | | | | | | |--- lead_time > 161.00 | | | | | | | | | | |--- weights: [1.20, 0.85] class: 0 | | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | |--- lead_time > 164.00 | | | | | | | | |--- arrival_date <= 9.50 | | | | | | | | | |--- avg_price_per_room <= 74.62 | | | | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | |--- avg_price_per_room > 74.62 | | | | | | | | | | |--- weights: [0.00, 11.90] class: 1 | | | | | | | | |--- arrival_date > 9.50 | | | | 
| | | | | |--- weights: [0.00, 44.20] class: 1 | | | | | | |--- lead_time > 215.50 | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | |--- weights: [10.95, 0.00] class: 0 | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | |--- weights: [0.00, 5.10] class: 1 | | | | | |--- arrival_year > 2017.50 | | | | | | |--- avg_price_per_room <= 82.47 | | | | | | | |--- lead_time <= 203.50 | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | |--- weights: [0.00, 34.85] class: 1 | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | |--- repeated_guest <= 0.50 | | | | | | | | | | |--- lead_time <= 170.50 | | | | | | | | | | | |--- weights: [5.70, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 170.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- repeated_guest > 0.50 | | | | | | | | | | |--- weights: [0.00, 2.55] class: 1 | | | | | | | |--- lead_time > 203.50 | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | | |--- no_of_week_nights <= 2.00 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- no_of_week_nights > 2.00 | | | | | | | | | | | |--- weights: [0.00, 7.65] class: 1 | | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | | |--- weights: [3.15, 0.00] class: 0 | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- weights: [0.00, 71.40] class: 1 | | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | | |--- weights: [2.85, 0.00] class: 0 | | | | 
| | |--- avg_price_per_room > 82.47 | | | | | | | |--- no_of_adults <= 2.50 | | | | | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 > 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | | | | |--- weights: [1.05, 0.00] class: 0 | | | | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | | | | |--- weights: [0.00, 11.90] class: 1 | | | | | | | | |--- market_segment_type_Corporate > 0.50 | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- no_of_adults > 2.50 | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | |--- no_of_special_requests > 0.50 | | | | |--- no_of_weekend_nights <= 0.50 | | | | | |--- lead_time <= 180.50 | | | | | | |--- lead_time <= 159.50 | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | |--- weights: [1.20, 0.00] class: 0 | | | | | | | |--- arrival_month > 8.50 | | | | | | | | |--- avg_price_per_room <= 74.05 | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 74.05 | | | | | | | | | |--- weights: [0.15, 4.25] class: 1 | | | | | | |--- lead_time > 159.50 | | | | | | | |--- arrival_date <= 1.50 | | | | | | | | |--- lead_time <= 176.50 | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- lead_time > 176.50 | | | | | | | | | |--- weights: [0.00, 1.70] class: 1 | | | | | | | |--- arrival_date > 1.50 | | | | | | | | |--- no_of_adults <= 0.50 | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- no_of_adults > 0.50 | | | | | | | | | |--- weights: [7.20, 0.00] class: 0 | | | | | |--- lead_time > 180.50 | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | |--- 
market_segment_type_Offline <= 0.50 | | | | | | | | |--- no_of_week_nights <= 0.50 | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | |--- no_of_week_nights > 0.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- weights: [0.00, 106.25] class: 1 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | |--- avg_price_per_room <= 86.22 | | | | | | | | | |--- weights: [1.65, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 86.22 | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 <= 0.50 | | | | | | | | | | | |--- weights: [0.30, 3.40] class: 1 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 > 0.50 | | | | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | |--- weights: [1.80, 0.00] class: 0 | | | | |--- no_of_weekend_nights > 0.50 | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | |--- arrival_date <= 24.50 | | | | | | | |--- avg_price_per_room <= 74.73 | | | | | | | | |--- lead_time <= 199.50 | | | | | | | | | |--- weights: [4.80, 0.00] class: 0 | | | | | | | | |--- lead_time > 199.50 | | | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | | | |--- no_of_adults <= 0.50 | | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | | | |--- no_of_adults > 0.50 | | | | | | | | | | | |--- weights: [3.00, 0.00] class: 0 | | | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | | | |--- avg_price_per_room <= 62.02 | | | | | | | | | | | |--- weights: [0.60, 0.00] class: 0 | | | | | | | | | | |--- 
avg_price_per_room > 62.02 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | |--- avg_price_per_room > 74.73 | | | | | | | | |--- no_of_week_nights <= 6.50 | | | | | | | | | |--- arrival_date <= 1.50 | | | | | | | | | | |--- weights: [1.95, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 1.50 | | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | | |--- truncated branch of depth 15 | | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- no_of_week_nights > 6.50 | | | | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | | | | |--- arrival_date <= 16.50 | | | | | | | | | | | |--- weights: [0.15, 5.10] class: 1 | | | | | | | | | | |--- arrival_date > 16.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | |--- arrival_date > 24.50 | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | | |--- lead_time <= 165.00 | | | | | | | | | | |--- arrival_date <= 28.50 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 28.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- lead_time > 165.00 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 <= 0.50 | | | | | | | | | | | |--- weights: [4.35, 0.00] class: 0 | | | | | | | | | | |--- room_type_reserved_Room_Type_4 > 0.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | | |--- lead_time <= 197.50 | | | | | | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | |--- lead_time > 197.50 | | | | | | | | | | |--- weights: [0.00, 2.55] class: 1 | | | | | | | |--- arrival_month > 7.50 | | | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | | | |--- avg_price_per_room <= 55.92 | | | | 
| | | | | | |--- weights: [0.45, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 55.92 | | | | | | | | | | |--- no_of_week_nights <= 0.50 | | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | | |--- no_of_week_nights > 0.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | | | |--- weights: [0.90, 0.00] class: 0 | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | |--- lead_time <= 348.50 | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | |--- weights: [19.50, 0.00] class: 0 | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | |--- weights: [1.35, 0.00] class: 0 | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | |--- weights: [0.30, 1.70] class: 1 | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | |--- lead_time > 348.50 | | | | | | | |--- weights: [1.20, 2.55] class: 1 | | |--- avg_price_per_room > 100.04 | | | |--- no_of_special_requests <= 2.50 | | | | |--- arrival_month <= 11.50 | | | | | |--- weights: [0.00, 1791.80] class: 1 | | | | |--- arrival_month > 11.50 | | | | | |--- no_of_special_requests <= 0.50 | | | | | | |--- weights: [7.05, 0.00] class: 0 | | | | | |--- no_of_special_requests > 0.50 | | | | | | |--- arrival_date <= 24.50 | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | |--- arrival_date > 24.50 | | | | | | | |--- lead_time <= 153.50 | | | | | | | | |--- weights: [0.15, 0.00] class: 0 | | | | | | | |--- lead_time > 153.50 | | | | | | | | |--- lead_time <= 172.50 | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | |--- weights: [0.30, 0.00] class: 0 | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | |--- weights: [0.00, 0.85] class: 1 | | | | | | | | |--- lead_time > 172.50 | | | | | | | | | 
|--- weights: [0.30, 11.90] class: 1 | | | |--- no_of_special_requests > 2.50 | | | | |--- weights: [4.65, 0.00] class: 0
importances = best_model_1.feature_importances_
indices = np.argsort(importances)
plt.figure(figsize=(12, 12))
plt.title("Feature Importances")
plt.barh(range(len(indices)), importances[indices], color="violet", align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
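The bar chart orders features by importance, but it is often handy to have the same ranking as plain text. The following is a minimal sketch, not part of the original notebook: `rank_features` is a hypothetical helper, and the numbers in the demo are placeholders chosen for illustration, not the fitted model's importances.

```python
def rank_features(names, importances, top_k=5):
    """Return the top_k (name, importance) pairs, largest importance first."""
    if len(names) != len(importances):
        raise ValueError("names and importances must have the same length")
    ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Placeholder values for illustration only; NOT the model's real importances.
example_names = ["lead_time", "avg_price_per_room", "no_of_special_requests"]
example_importances = [0.5, 0.2, 0.3]

for name, score in rank_features(example_names, example_importances, top_k=3):
    print(f"{name:<25s} {score:.3f}")
```

In the notebook, the same helper could presumably be called as `rank_features(feature_names, best_model_1.feature_importances_)`.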
# training performance comparison
models_train_comp_df = pd.concat(
    [
        decision_tree_perf_train.T,
        decision_tree_tune_perf_train.T,
        decision_tree_post_perf_train.T,
    ],
    axis=1,
)
models_train_comp_df.columns = [
    "Decision Tree sklearn",
    "Decision Tree (Pre-Pruning)",
    "Decision Tree (Post-Pruning)",
]
print("Training performance comparison:")
models_train_comp_df
Training performance comparison:
| | Decision Tree sklearn | Decision Tree (Pre-Pruning) | Decision Tree (Post-Pruning) |
|---|---|---|---|
| Accuracy | 0.990312 | 0.599913 | 0.973614 |
| Recall | 0.998326 | 0.960899 | 0.997728 |
| Precision | 0.972964 | 0.449743 | 0.927626 |
| F1 | 0.985482 | 0.612710 | 0.961401 |
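The F1 row can be sanity-checked from the Precision and Recall rows, since F1 is the harmonic mean of the two. The sketch below assumes the performance helper reports the standard binary F1; `f1_from_precision_recall` is a hypothetical name, and the inputs are copied from the training table above.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (standard binary F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) taken from the training comparison table.
training_rows = {
    "Decision Tree sklearn": (0.972964, 0.998326, 0.985482),
    "Decision Tree (Pre-Pruning)": (0.449743, 0.960899, 0.612710),
    "Decision Tree (Post-Pruning)": (0.927626, 0.997728, 0.961401),
}

for model, (precision, recall, reported_f1) in training_rows.items():
    computed = f1_from_precision_recall(precision, recall)
    print(f"{model}: computed F1 = {computed:.6f}, reported F1 = {reported_f1:.6f}")
```

The computed values agree with the reported ones up to rounding, which suggests the table rows are internally consistent.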
# test performance comparison
models_test_comp_df = pd.concat(
    [
        decision_tree_perf_test.T,
        decision_tree_tune_perf_test.T,
        decision_tree_post_perf_test.T,
    ],
    axis=1,
)
models_test_comp_df.columns = [
    "Decision Tree sklearn",
    "Decision Tree (Pre-Pruning)",
    "Decision Tree (Post-Pruning)",
]
print("Test performance comparison:")
models_test_comp_df
Test performance comparison:
| | Decision Tree sklearn | Decision Tree (Pre-Pruning) | Decision Tree (Post-Pruning) |
|---|---|---|---|
| Accuracy | 0.867867 | 0.599467 | 0.867132 |
| Recall | 0.803521 | 0.953152 | 0.832765 |
| Precision | 0.791387 | 0.444577 | 0.773879 |
| F1 | 0.797408 | 0.606340 | 0.802243 |
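One way to read the two comparison tables together is to subtract the scores computed from the `_perf_test` frames from the corresponding training scores: a large positive gap points to overfitting. The sketch below is an illustration only; `generalization_gap` is a hypothetical helper, and all numbers are copied from the two tables.

```python
# Scores copied from the training table and the table built from the *_perf_test frames.
train_scores = {
    "Decision Tree sklearn": {"Accuracy": 0.990312, "Recall": 0.998326, "Precision": 0.972964, "F1": 0.985482},
    "Decision Tree (Pre-Pruning)": {"Accuracy": 0.599913, "Recall": 0.960899, "Precision": 0.449743, "F1": 0.612710},
    "Decision Tree (Post-Pruning)": {"Accuracy": 0.973614, "Recall": 0.997728, "Precision": 0.927626, "F1": 0.961401},
}
test_scores = {
    "Decision Tree sklearn": {"Accuracy": 0.867867, "Recall": 0.803521, "Precision": 0.791387, "F1": 0.797408},
    "Decision Tree (Pre-Pruning)": {"Accuracy": 0.599467, "Recall": 0.953152, "Precision": 0.444577, "F1": 0.606340},
    "Decision Tree (Post-Pruning)": {"Accuracy": 0.867132, "Recall": 0.832765, "Precision": 0.773879, "F1": 0.802243},
}


def generalization_gap(train, test):
    """Return (train - test) for every metric of every model, rounded to 6 places."""
    return {
        model: {
            metric: round(train[model][metric] - test[model][metric], 6)
            for metric in train[model]
        }
        for model in train
    }


gaps = generalization_gap(train_scores, test_scores)
for model, metrics in gaps.items():
    print(model, metrics)
```

On these numbers the default and post-pruned trees lose roughly 0.16 to 0.19 in F1 between train and test, while the pre-pruned tree barely changes, albeit at a much lower accuracy and precision.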
best_model2 = DecisionTreeClassifier(
    ccp_alpha=0.0015, class_weight={0: 0.15, 1: 0.85}, random_state=1
)
best_model2.fit(X_train, y_train)
DecisionTreeClassifier(ccp_alpha=0.0015, class_weight={0: 0.15, 1: 0.85},
random_state=1)
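Because the model is fitted with `class_weight={0: 0.15, 1: 0.85}`, the `weights` shown when the tree is exported with `show_weights=True` look like weighted class totals rather than raw row counts: every value in this notebook's output is a multiple of 0.15 (class 0) or 0.85 (class 1). Under that assumption, the sketch below converts a leaf's weights back to approximate sample counts; `leaf_sample_counts` is a hypothetical helper, not an sklearn function.

```python
CLASS_WEIGHT = {0: 0.15, 1: 0.85}  # copied from the best_model2 definition


def leaf_sample_counts(weights, class_weight=CLASS_WEIGHT):
    """Convert a leaf's weighted class totals back to raw sample counts.

    Assumes weights[i] == class_weight[i] * (number of class-i samples).
    """
    return [round(w / class_weight[label]) for label, w in enumerate(weights)]


# Leaf values as they appear in the exported text of best_model2.
print(leaf_sample_counts([297.30, 7.65]))    # [1982, 9]
print(leaf_sample_counts([118.35, 72.25]))   # [789, 85]
```

The predicted class of a leaf follows the larger weighted total, so a leaf can predict class 1 even when class 0 has more raw samples. This is how the heavier weight on cancellations pushes recall up at the cost of precision.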
confusion_matrix_sklearn(best_model2, X_train, y_train)
decision_tree_post_perf_train_1 = model_performance_classification_sklearn(
best_model2, X_train, y_train
)
decision_tree_post_perf_train_1
# print("Recall Score:", decision_tree_postpruned_perf_train)
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.700614 | 0.931962 | 0.525663 | 0.672186 |
decision_tree_post_perf_test_1 = model_performance_classification_sklearn(
best_model2, X_test, y_test
)
decision_tree_post_perf_test_1
# print("Recall Score:", decision_tree_postpruned_perf_test)
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.695948 | 0.924759 | 0.516902 | 0.663138 |
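The low F1 alongside the high recall follows directly from the low precision, since F1 is the harmonic mean of the two. A minimal sketch of that arithmetic, using the precision and recall values reported above for `best_model2` (the helper name `f1_from` is hypothetical and not part of the notebook):

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the train and test outputs above.
train_f1 = f1_from(0.525663, 0.931962)
test_f1 = f1_from(0.516902, 0.924759)
print(f"train F1 = {train_f1:.4f}, test F1 = {test_f1:.4f}")
```

Both values should agree with the F1 column to within rounding, which is a quick consistency check on the reported metrics.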
plt.figure(figsize=(25, 14))
out = tree.plot_tree(
best_model2,
feature_names=feature_names,
filled=True,
fontsize=9,
node_ids=False,
class_names=None,
)
for o in out:
    arrow = o.arrow_patch
    if arrow is not None:
        arrow.set_edgecolor("black")
        arrow.set_linewidth(1)
plt.show()
print(tree.export_text(best_model2, feature_names=feature_names, show_weights=True))
|--- lead_time <= 90.50
|   |--- no_of_special_requests <= 1.50
|   |   |--- market_segment_type_Online <= 0.50
|   |   |   |--- no_of_weekend_nights <= 0.50
|   |   |   |   |--- market_segment_type_Corporate <= 0.50
|   |   |   |   |   |--- avg_price_per_room <= 178.44
|   |   |   |   |   |   |--- weights: [297.30, 7.65] class: 0
|   |   |   |   |   |--- avg_price_per_room > 178.44
|   |   |   |   |   |   |--- weights: [0.75, 13.60] class: 1
|   |   |   |   |--- market_segment_type_Corporate > 0.50
|   |   |   |   |   |--- weights: [118.35, 72.25] class: 0
|   |   |   |--- no_of_weekend_nights > 0.50
|   |   |   |   |--- lead_time <= 68.50
|   |   |   |   |   |--- arrival_month <= 9.50
|   |   |   |   |   |   |--- weights: [147.30, 112.20] class: 0
|   |   |   |   |   |--- arrival_month > 9.50
|   |   |   |   |   |   |--- weights: [105.00, 16.15] class: 0
|   |   |   |   |--- lead_time > 68.50
|   |   |   |   |   |--- avg_price_per_room <= 99.98
|   |   |   |   |   |   |--- weights: [30.00, 21.25] class: 0
|   |   |   |   |   |--- avg_price_per_room > 99.98
|   |   |   |   |   |   |--- weights: [6.60, 68.85] class: 1
|   |   |--- market_segment_type_Online > 0.50
|   |   |   |--- no_of_special_requests <= 0.50
|   |   |   |   |--- lead_time <= 3.50
|   |   |   |   |   |--- weights: [92.25, 80.75] class: 0
|   |   |   |   |--- lead_time > 3.50
|   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |--- weights: [48.30, 40.80] class: 0
|   |   |   |   |   |--- arrival_year > 2017.50
|   |   |   |   |   |   |--- arrival_month <= 1.50
|   |   |   |   |   |   |   |--- weights: [16.80, 10.20] class: 0
|   |   |   |   |   |   |--- arrival_month > 1.50
|   |   |   |   |   |   |   |--- weights: [165.45, 1563.15] class: 1
|   |   |   |--- no_of_special_requests > 0.50
|   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |--- lead_time <= 6.50
|   |   |   |   |   |   |--- weights: [116.55, 39.95] class: 0
|   |   |   |   |   |--- lead_time > 6.50
|   |   |   |   |   |   |--- required_car_parking_space <= 0.50
|   |   |   |   |   |   |   |--- weights: [370.80, 612.00] class: 1
|   |   |   |   |   |   |--- required_car_parking_space > 0.50
|   |   |   |   |   |   |   |--- weights: [19.65, 0.00] class: 0
|   |   |   |   |--- arrival_month > 11.50
|   |   |   |   |   |--- weights: [59.40, 0.85] class: 0
|   |--- no_of_special_requests > 1.50
|   |   |--- no_of_week_nights <= 3.50
|   |   |   |--- weights: [318.90, 0.00] class: 0
|   |   |--- no_of_week_nights > 3.50
|   |   |   |--- weights: [46.80, 32.30] class: 0
|--- lead_time > 90.50
|   |--- lead_time <= 151.50
|   |   |--- no_of_special_requests <= 0.50
|   |   |   |--- arrival_month <= 1.50
|   |   |   |   |--- weights: [14.55, 0.00] class: 0
|   |   |   |--- arrival_month > 1.50
|   |   |   |   |--- market_segment_type_Online <= 0.50
|   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |--- weights: [93.15, 359.55] class: 1
|   |   |   |   |   |--- arrival_month > 11.50
|   |   |   |   |   |   |--- weights: [15.60, 0.85] class: 0
|   |   |   |   |--- market_segment_type_Online > 0.50
|   |   |   |   |   |--- weights: [52.80, 657.90] class: 1
|   |   |--- no_of_special_requests > 0.50
|   |   |   |--- weights: [212.85, 326.40] class: 1
|   |--- lead_time > 151.50
|   |   |--- avg_price_per_room <= 100.04
|   |   |   |--- no_of_special_requests <= 0.50
|   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |--- market_segment_type_Online <= 0.50
|   |   |   |   |   |   |--- weights: [52.65, 48.45] class: 0
|   |   |   |   |   |--- market_segment_type_Online > 0.50
|   |   |   |   |   |   |--- weights: [1.95, 56.10] class: 1
|   |   |   |   |--- no_of_adults > 1.50
|   |   |   |   |   |--- weights: [49.50, 951.15] class: 1
|   |   |   |--- no_of_special_requests > 0.50
|   |   |   |   |--- no_of_weekend_nights <= 0.50
|   |   |   |   |   |--- weights: [14.85, 125.80] class: 1
|   |   |   |   |--- no_of_weekend_nights > 0.50
|   |   |   |   |   |--- weights: [73.05, 85.85] class: 1
|   |   |--- avg_price_per_room > 100.04
|   |   |   |--- weights: [13.20, 1804.55] class: 1
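The leaf `weights` printed above appear to be class-weighted totals rather than raw booking counts, because the model was fitted with `class_weight={0: 0.15, 1: 0.85}`. A minimal sketch of converting a leaf back to approximate counts, assuming no extra `sample_weight` was passed to `fit` and that this scikit-learn version reports weighted sums (the helper `leaf_counts` is hypothetical):

```python
# Class weights used when fitting best_model2 (copied from the cell above).
CLASS_WEIGHT = {0: 0.15, 1: 0.85}


def leaf_counts(weights, class_weight=CLASS_WEIGHT):
    """Divide each weighted class total by its class weight to recover counts."""
    return [round(w / class_weight[cls]) for cls, w in enumerate(weights)]


# First two leaves of the printed tree.
print(leaf_counts([297.30, 7.65]))
print(leaf_counts([0.75, 13.60]))
```

Reading leaves this way matters when judging a rule: a leaf labeled class 1 can still hold more non-canceled bookings than canceled ones in raw counts, because canceled bookings carry a larger weight.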
# importance of features in the tree building ( The importance of a feature is computed as the
# (normalized) total reduction of the 'criterion' brought by that feature. It is also known as the Gini importance )
print(
pd.DataFrame(
best_model2.feature_importances_, columns=["Imp"], index=X_train.columns
).sort_values(by="Imp", ascending=False)
)
                                           Imp
lead_time                             0.368027
no_of_special_requests                0.252270
market_segment_type_Online            0.181183
arrival_month                         0.061556
avg_price_per_room                    0.037981
no_of_weekend_nights                  0.029794
arrival_year                          0.019491
no_of_adults                          0.014164
market_segment_type_Corporate         0.013949
no_of_week_nights                     0.012646
required_car_parking_space            0.008939
no_of_previous_cancellations          0.000000
room_type_reserved_Room_Type_4        0.000000
market_segment_type_Offline           0.000000
arrival_date                          0.000000
market_segment_type_Complementary     0.000000
room_type_reserved_Room_Type_7        0.000000
room_type_reserved_Room_Type_6        0.000000
room_type_reserved_Room_Type_5        0.000000
room_type_reserved_Room_Type_3        0.000000
no_of_previous_bookings_not_canceled  0.000000
room_type_reserved_Room_Type_2        0.000000
type_of_meal_plan_Not_Selected        0.000000
type_of_meal_plan_Meal_Plan_2         0.000000
no_of_children                        0.000000
repeated_guest                        0.000000
type_of_meal_plan_Meal_Plan_3         0.000000
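To see how concentrated the importance is, a running total over the ranked features can help. The sketch below uses only the nonzero importances copied from the output above and plain Python rather than the fitted model, so it runs on its own; the variable names are illustrative.

```python
from itertools import accumulate

# Nonzero Gini importances copied from the printed output above.
importances = [
    ("lead_time", 0.368027),
    ("no_of_special_requests", 0.252270),
    ("market_segment_type_Online", 0.181183),
    ("arrival_month", 0.061556),
    ("avg_price_per_room", 0.037981),
    ("no_of_weekend_nights", 0.029794),
    ("arrival_year", 0.019491),
    ("no_of_adults", 0.014164),
    ("market_segment_type_Corporate", 0.013949),
    ("no_of_week_nights", 0.012646),
    ("required_car_parking_space", 0.008939),
]

# Rank by importance, then accumulate to get the share explained so far.
ranked = sorted(importances, key=lambda item: item[1], reverse=True)
cumulative = list(accumulate(value for _, value in ranked))

for (name, value), running in zip(ranked, cumulative):
    print(f"{name:32s} {value:.6f}  cumulative={running:.3f}")
```

The first three features (lead time, special requests, and the online market segment) account for roughly 80% of the total importance in this tree.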
importances = best_model2.feature_importances_
indices = np.argsort(importances)
plt.figure(figsize=(12, 12))
plt.title("Feature Importances")
plt.barh(range(len(indices)), importances[indices], color="violet", align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
# training performance comparison
models_train_comp_df = pd.concat(
[
decision_tree_perf_train.T,
decision_tree_tune_perf_train.T,
decision_tree_post_perf_train_1.T,
],
axis=1,
)
models_train_comp_df.columns = [
"Decision Tree sklearn",
"Decision Tree (Pre-Pruning)",
"Decision Tree (Post-Pruning)",
]
print("Training performance comparison:")
models_train_comp_df
Training performance comparison:
| | Decision Tree sklearn | Decision Tree (Pre-Pruning) | Decision Tree (Post-Pruning) |
|---|---|---|---|
| Accuracy | 0.990312 | 0.599913 | 0.700614 |
| Recall | 0.998326 | 0.960899 | 0.931962 |
| Precision | 0.972964 | 0.449743 | 0.525663 |
| F1 | 0.985482 | 0.612710 | 0.672186 |
# test performance comparison
models_test_comp_df = pd.concat(
    [
        decision_tree_perf_test.T,
        decision_tree_tune_perf_test.T,
        decision_tree_post_perf_test_1.T,
    ],
    axis=1,
)
models_test_comp_df.columns = [
    "Decision Tree sklearn",
    "Decision Tree (Pre-Pruning)",
    "Decision Tree (Post-Pruning)",
]
print("Test performance comparison:")
models_test_comp_df
Test performance comparison:
| | Decision Tree sklearn | Decision Tree (Pre-Pruning) | Decision Tree (Post-Pruning) |
|---|---|---|---|
| Accuracy | 0.867867 | 0.599467 | 0.695948 |
| Recall | 0.803521 | 0.953152 | 0.924759 |
| Precision | 0.791387 | 0.444577 | 0.516902 |
| F1 | 0.797408 | 0.606340 | 0.663138 |
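One way to read the two comparison tables together is to look at the gap between the first table (training data) and the second (test data) for each model. The sketch below hard-codes the recall and F1 values reported above, with the post-pruned column taken from `best_model2`; the `scores` dictionary and `train_test_gap` helper are illustrative names, not part of the notebook.

```python
# (train, test) pairs copied from the two comparison tables above.
scores = {
    "Decision Tree sklearn": {
        "Recall": (0.998326, 0.803521),
        "F1": (0.985482, 0.797408),
    },
    "Decision Tree (Pre-Pruning)": {
        "Recall": (0.960899, 0.953152),
        "F1": (0.612710, 0.606340),
    },
    "Decision Tree (Post-Pruning)": {
        "Recall": (0.931962, 0.924759),
        "F1": (0.672186, 0.663138),
    },
}


def train_test_gap(scores):
    """Return train minus test for every model and metric."""
    return {
        model: {m: round(train - test, 6) for m, (train, test) in metrics.items()}
        for model, metrics in scores.items()
    }


gaps = train_test_gap(scores)
for model, gap in gaps.items():
    print(f"{model:30s} recall gap={gap['Recall']:.3f}  F1 gap={gap['F1']:.3f}")
```

A large gap, as for the default tree, suggests overfitting, while both pruned trees generalize with gaps below 0.01 at the cost of lower precision.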
Cancellation policies should account for the interplay of lead time, market segment (type of booking), number of special requests, average price per room, and number of weekend nights.
When the lead time is at most 90 days, the booking is neither online nor corporate, there is at most one special request, and there are no weekend nights, the tree predicts no cancellation when the average price per room is below about 178 euros and a cancellation when it is above about 178 euros.
The recommendation, therefore, is that the hotel adjust the average price per room according to the lead time to reduce cancellations.
To limit cancellations or resource loss when the average price exceeds 178 euros, management can consider offering a larger refund, or a full refund if possible.
For such bookings with at least one weekend night, the tree predicts a cancellation when the lead time is between about 69 and 90 days and the average price exceeds about 100 euros. Management can respond by reducing the price per room or increasing the refund amount.
The number of special requests is also an important factor. If the hotel fulfills customers' requests, the chance of cancellation is reduced.
The logistic regression model shows that cancellations increase with the number of family members on a booking. Hotel management can respond by improving quality and adding facilities that appeal to both children and adults.
Online booking is also a risk factor: cancellations are higher for online bookings. Hotel management needs to focus on this segment to reduce resource loss; lowering prices and improving quality and facilities may help reduce cancellations.
For repeated guests, the chance of cancellation decreases. By improving quality, the hotel can encourage customers to return.
The hotel should review Meal Plan 2, since its coefficient in the regression model is positive. Improving its quality may help reduce cancellations.
The model indicates that bookings requiring a car parking space are less likely to be canceled. The hotel can expand and improve its car parking.
Arrival month is also an important factor: August, September, and October have higher cancellations. Reducing prices in these months may help the hotel reduce cancellations.
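The price thresholds quoted in the tree-based recommendations above come from one subtree of `best_model2` (lead time at most 90.5 days, at most one special request, not an online booking). A minimal sketch of that subtree as a plain function follows; the function name and argument names are hypothetical, the thresholds are copied from the printed tree, and the function returns `None` for bookings that fall outside this subtree rather than guessing.

```python
def offline_short_lead_rule(
    lead_time, special_requests, online, corporate, weekend_nights, avg_price
):
    """Return 1 (predicted canceled), 0 (not canceled), or None if not covered."""
    # This sketch covers only: lead_time <= 90.5, requests <= 1.5, not online.
    if lead_time > 90.5 or special_requests > 1.5 or online:
        return None
    if weekend_nights <= 0.5:
        if corporate:
            return 0
        # Non-corporate, no weekend nights: the split is at 178.44 euros.
        return 1 if avg_price > 178.44 else 0
    # At least one weekend night: both leaves with lead_time <= 68.5 are class 0.
    if lead_time <= 68.5:
        return 0
    # Lead time between 68.5 and 90.5 days: the split is at 99.98 euros.
    return 1 if avg_price > 99.98 else 0


print(offline_short_lead_rule(30, 1, False, False, 0, 200.0))
print(offline_short_lead_rule(80, 0, False, False, 2, 120.0))
```

Writing the rule out this way makes it clear that the 178 euro and 100 euro thresholds apply only to specific combinations of lead time, segment, and weekend nights, not to all bookings.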